Just to talk about a different direction here for a second:
Something that I find to be a frustrating side effect of malware issues like this is that it seems to result in well-intentioned security teams locking down the data in apps.
The justification is quite plausible -- in this case WhatsApp messages were being stolen! But the thing is... if this isn't what they steal, they'll steal something else.
Meanwhile, locking down those apps so that only apps with a certain signature can read from your WhatsApp means that if you want to back up your messages or read them for any legitimate purpose, you're now SOL, or reliant on a usually slow, non-automatable, UI-only flow.
I'm glad that modern computers are more secure than they have been, but I think that defense in depth by locking down everything and creating more silos is a problem of its own.
I agree with this; just to note for context though: this (or rather the package that was forked) is not a wrapper of any official WhatsApp API or anything like that. It poses as a WhatsApp client (WhatsApp Web), whose protocol the author reverse engineered.
So users go through the same steps as if they were connecting another client to their WhatsApp account, and the client gets full access to all data of course.
From what I understand WhatsApp is already fairly locked down, so people had to resort to this sort of thing – if WA had actually offered this data via a proper API with granular permissions, there might have been a lower chance of this happening.
See: https://baileys.wiki/docs/intro/
The OS should be mediating such access where it explicitly asks your permission for an app to access data belonging to another publisher.
I could certainly see the value in this in principle, but sadly the labyrinthine mess that is the Apple permission system (in which they learned none of the lessons of early UAC) illustrates the kind of result that seems to arise from this.
A great microcosm of this is automation permissions on macOS right now: there's a separate allow dialog for every single app. If you try to use a general-purpose automation app, it needs to request permission for every single app on your computer individually the first time you use it. Having experienced that in practice, it... absolutely sucks.
At this point it makes me feel like we need something like an async audit API. Maybe the OS just tracks and logs all of your apps' activity and then:
1) You can view it of course.
2) The OS monitors for deviations from expected patterns for that app globally (kinda like Microsoft's SmartScreen?)
3) Your own apps can get permission to read this audit log if you want to analyze it your own way and/or be more secure. If you're more paranoid maybe you could use a variant that kills an app in a hurry if it's misbehaving.
Sadly you can't even implement this as a third party thing on macOS at this point because the security model prohibits you from monitoring other apps. You can't even do it with the user's permission because tracing apps requires you to turn SIP off.
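To make the shape of that concrete, here is a rough TypeScript sketch of what such an audit API could look like. This is purely hypothetical; no OS exposes anything like it today, and every name here is invented for illustration:

    // Hypothetical OS audit API; nothing here exists on macOS or elsewhere.
    interface AuditEvent {
      app: string;        // identifier of the acting app
      action: "file-read" | "file-write" | "net-connect" | "automation";
      target: string;     // path, host, or target app
      timestamp: number;
    }

    interface AuditLog {
      // Points 1 and 3: replay history for your own analysis.
      query(since: number): Promise<AuditEvent[]>;
      // The paranoid variant: subscribe and kill a misbehaving app fast.
      watch(handler: (event: AuditEvent) => void): () => void; // returns unsubscribe
    }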
> Maybe the OS just tracks and logs all of your apps' activity
The problem here is that, like so many social-media apps, the first thing the app will do is scrape as much as it possibly can from the device, lest it lose access later, at which point auditing it and restricting its permissions is already too late.
Give an inch, and they’ll take a mile. Better to make them justify every millimetre instead.
This just sounds like another security nightmare.
We're not in 1980 anymore. Most people need zero apps with full access to the disk, and even power users need at most one or two.
In macOS, for example, the sandbox and the file dialog already allow opening any file, bundle or folder on the disk. I haven't really come across any app that does better browsing than this dialog, but if there is one, it should be a special case. Funnily enough, WhatsApp on iOS is an app that reimplements the photo browser, as a dark pattern to force users to either give full permission to photos or suffer.
The only time the OS file dialog becomes limited is when a "file" is actually multiple files. Which is 1) solvable by bundles or folders and 2) a symptom of developers not giving a shit about usability.
This sounds great on paper, but what happens when the OS isn't working for the user, like Windows?
Switch OS.
I mean this was an app for accessing WhatsApp data, you would approve it and go on... the problem is with it sending data off to a 3rd party.
I think you misunderstood. If the OS becomes the arbiter of what can and cannot be accessed, it's a slippery slope to the OS becoming a walled garden in which only approved apps and developers are allowed to operate. Of course that is a pretty large generalization, but we already see it with mobile devices and are starting to see it with Windows and macOS.
I don't think we should be handing more power to OS makers and away from users. There has to be a middle ground between walled gardens and open systems. It would be much better for node & npm to come up with a solution than to lock down access.
The arbiter of what can be accessed should be the user, and always the user. The OS should be merely the enforcer.
Currently OSs are a free-for-all, where the user must blindly trust third-party apps, or they enforce it clumsily like in macOS.
This was fine in 1980 but isn't anymore.
Windows is dead
MacOS does this. It has a popup to grant access to folders like documents.
I'm pretty sure WhatsApp does this for anti-competitive reasons, not security reasons.
xkcd covers this really well: https://xkcd.com/2044/
> Meanwhile, locking down those apps so that only apps with a certain signature can read from your WhatsApp means that if you want to back up your messages or read them for any legitimate purpose, you're now SOL, or reliant on a usually slow, non-automatable, UI-only flow.
...and this gives them more control, so they can profit from it. Corporate greed knows no bounds.
> I'm glad that modern computers are more secure than they have been
I'm not. Back when malware was more prevalent among the lower class, there was also far more freedom and interoperability.
> Back when malware was more prevalent among the lower class, there was also far more freedom and interoperability.
Yeah, “the lower class” had the freedom of having their IM accounts hacked and blasting spam/fraud messages to all contacts, all the time. How nostalgic.
I imagine the average HN commenter seeing every new story being posted and thinking "how could I criticise big tech using this"
I don't really know what I'm doing, but: why couldn't messages be stored encrypted on a blockchain, with a system where both users in a one-on-one conversation agree to a key, or have their own keys, that grants permission for 'their' messages? Then you'd never be locked into a private software / private database / private protocol. You could read your messages at any point with your key.
I hope we all have the same change in sentiment about AI in 3 years from now!
At this point, the existence of these attacks should be an expected outcome. (It should have been expected even without the empirical record we now have and the multiple times that we can now cite.)
NPM and NPM-style package managers that are designed to late-fetch dependencies just before build-time are already fundamentally broken. They're an end-run around the underlying version control system, all in favor of an ill-considered, half-baked alternative approach to version control of the package-manager maintainers' own devising.
And they provide cover for attacks like this, because they encourage a culture where, because one's dependencies are all "over there", the massive surface area gets swept under the rug and they never get reviewed (because 56K NPM users can't be wrong).
I am slowly waking up to the realization that we (software engineers) are laughably bad at security. I used to think that it was only NPM (I have worked a lot in this ecosystem over the years), but I have found this to be essentially everywhere. NPM is a poster child for this because of executable scripts on install, but every package manager essentially boils down to "install this thing by name, no security checks".

Every ecosystem I touch now has this (apart from gamedev, but only because I roll everything myself there by choice) - e.g. Cargo has a lot of "tools" that you install globally so that you get some capability (like flamegraphs, asm output, test runners, etc.) - this is the same vulnerability, manifesting slightly differently. Like others have pointed out, it is common to just pull random Docker images via Helm charts. It is also common to get random "utility" tools during builds in CI/CD pipelines, just by curl-ing random URLs of various "release archives".

You don't even have to look too hard - this is surface level in pretty much every company, in almost every industry. (I have my doubts about the security theatre in some, but I have no first-hand experience, so cannot say.)
The issue I have is that I don't really have a good idea for a solution to this problem - on one hand, I don't expect everyone to roll entire modern stacks by hand every time. Killing collaborative software development seems like literally throwing the baby out with the bath water. On the other hand, I feel like nothing I touch is "secure" in any real sense - the tick boxes are there, and they are all checked, but I don't think a single one of them really protects me against anything - most of the time, the monster is already inside the house.
> The issue I have is that I don't really have a good idea for a solution to this problem - on one hand, I don't expect everyone to roll entire modern stacks by hand every time. Killing collaborative software development seems like literally throwing the baby out with the bath water.
Is NPM really collaborative? People just throw stuff out there and you can pick it up. It's the lowest common denominator of collaboration.
The thing that NPM is missing is trust, and trust doesn't scale to 1000x dependencies.
I think the solution is a build system that requires version pinning - options include Nix, Bazel, and Buck.
I am a big fan of Bazel and have explored Nix (although, regrettably, I have not used it in anger quite yet) - both seem like good steps in the right direction and something I would love to see more usage/evolution of. However, it is important to recognize that these tools have a steep learning curve and require deep knowledge in more than one aspect in order to be used effectively, or at all.
Speed of development and development experience are not metrics to be minimized or discarded lightly. If you were to start a company/product/project tomorrow, a lot of the things you want to be doing in the beginning are not related to these tools. You probably, most of the time, want to be exploring your solution space. Creating a development and CI/CD environment that can fully take advantage of these tools' capabilities (like hermeticity and reproducibility) is not straightforward - in most cases setting up, scaling and maintaining these often requires a whole team with knowledge that most developers won't have. You don't want to gatekeep the writing of new software behind such requirements. But I do agree that the default should be closer to this than what we have today. How we get there - now that is the million dollar question.
Back in the days of Makefiles and autoconf, we tended to require specific versions and would document that in the readme.
Something that I keep thinking about is spec driven design.
If, for code, there were a parallel "state" document capturing the intent behind each line of code and each function, and that state document were in turn connected to a "higher layer of abstraction" document (recursively upward, as needed) to tie in higher layers of intent...
Such a thing would make it easier to surface weird behavior imo, alongside general "spec driven design" perks. More human readable = more eyes, and potential for automated LLM analysis too.
I'm not sure it'd be _Perfect_, but I think it'd be loads better than what we've got now
IMO the solution is auditing. We should be auditing every single version of every single dependency before we use it. Not necessarily personally, but we could have a review system like eBay/Uber/Airbnb's and require N trusted reviews.
This is the way. But people read it, nod their heads, and then go back to yolo'ing dependencies into their project without reading them. Culture change is needed.
I agree with much of what you said here, but is it really just about the package manager? If I had specified this repo's git url with a specific version number or sha directly in my package.json, the outcome would be just about the same. And so that's not really an end-run around version control at that point. Even with npm out of the picture the problem is still there.
> If I had specified this repo's git url with a specific version number or sha directly in my package.json[…] that's not really an end-run around version control at that point
Yes it is. Git doesn't operate based on package.json.
You're still trying to devise a scheme where, instead of Git tracking the source code of what you're building and deploying and/or turning into a release, you're excluding parts of that content from Git's purview. That's doing an end-run around the VCS.
The root problem is the OS allows npm packages to grab your WhatsApp messages without the user knowing.
This is an npm package that allows you to interact with WhatsApp using their API. The OS wouldn’t prevent this as it’s not interacting with your WhatsApp on your machine, but rather logging you in via a skillfully made 3rd-party interface that unfortunately happens to also be evil.
There are so many package managers out there for different platforms. I feel like there should be some more general, standardized, package manager that is language agnostic. Something that:
- has some guarantees about dependencies
- has some guarantees about provenance (only allow if signed by x, y, z kind of thing)
- has a standardized api so corporate or third party curation of packages is possible (I want my own company package manager that I curate)
- does ????
I don't know, it just seems like every tech area has these problems and I honestly don't understand why there aren't more 'standardized' solutions here
> They're an end-run around the underlying version control system
I assume by "underlying version control system" you mean apt, rpm, homebrew and friends? They don't solve this problem either. Nobody in the opensource world is auditing code for you. Compromised xz still made it into apt. Who knows how many other packages are compromised in a similar way?
Also, apt and friends don't solve the problem that npm, cargo, pip and so on solve. I'm writing some software. I want to depend on some package X at version Y (eg numpy, serde, react, whatever). I want to use that package, at that version, on all supported platforms. Debian. Ubuntu. Redhat. MacOS. And so on. Try and do that using the system package manager and you're in a world of hurt. "Oh, your system only has official packages for SDL2, not SDL3. Maybe move your entire computer to an unstable branch of ubuntu to fix it?" / "Yeah, we don't have that python package in homebrew. Maybe you could add it and maintain it yourself?" / "New ticket: I'm trying to run your software in gentoo, but it only has an earlier version of dependency Y."
Hell. Utter hell.
No, other trusted repositories are legitimately better because the maintainers built the software themselves. They don't purely rely on binaries from the original developer.
It's not perfect and bad things still make it through, but just look at your example - XZ. This never made it into Debian stable repositories and it was caught remarkably quickly. Meanwhile, we have NPM vulnerability after vulnerability.
But do they audit the code? I say mostly no. They grab the source, try to compile it. Develop patches to fix problems on the specific platform. Once it works, passes the tests, it's done. Package created, added to the repo.
Even OpenBSD, famous for auditing their code, doesn't audit packages. Only the base system.
> I assume by "underlying version control system" you mean apt, rpm, homebrew and friends
No. Git.
...unless your system package manager is nix.
What is so special about nix that it avoids all these issues?
nix is designed to support many versions of your dependencies on the same system by building a hash of your dependency graph and using that as a kind of dependency namespace for the various applications you have installed. The result is that you can run many versions of whatever application you want on the same system.
Unless someone is vetting code, nothing.
> Nobody in the opensource world is auditing code for you
That's still true of nix. Whether you should trust a package is on you. But nix solves everything else listed here.
> I want to use that package, at that version, on all supported platforms...
Nix derivations will fail to build if their contents rely on the FHS (https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html), so if a package tries to blindly trust that `/bin/bash` is in fact a compatible version of what you think it is, it won't make it into the package set. So we can each package our own bash script, and instead of running on "bash" each will run on the precise version of bash that we packaged with it. This goes for everything, though: compilers, linkers, interpreters, packages that you might otherwise have installed with pip or npm or cargo... nix demands a hash for it up front. It could still have been malicious the whole time, but it can't suddenly become malicious at a later date.
> ... Debian. Ubuntu. Redhat. MacOS. And so on. Try and do that using the system package manager and you're in a world of hurt.
If you're on NixOS, nix is your system package manager. If you're not, you can still install nix and use it on all of those platforms (not Windows, certain heroic folk are working on that, WSL works though)
> "Oh, your system only has official packages for SDL2, not SDL3. Maybe move your entire computer to an unstable branch of ubuntu to fix it?"
I just installed SDL3; nix put it in `/nix/store/yla09kr0357x5khlm8ijkmfm8vvzzkxb-sdl3-3.2.26`. Then I installed SDL2; nix put it in `/nix/store/a5ybsxyliwbay8lxx4994xinr2jw079z-sdl2-compat-2.32.58`. If I want one or the other at different times, nix will add or remove those from my path. I just have to tell nix which one I want...
$ nix shell nixpkgs#sdl2-compat
$ # now I have sdl2
$ exit
$ nix shell nixpkgs#sdl3
$ # now I have sdl3
> "Yeah, we don't have that python package in homebrew. Maybe you could add it and maintain it yourself?"
All of the major languages have some kind of foo2nix adapter package. When I want to use a python package that's not in nixpkgs, I use uv2nix and nix handles enforcing package sanity on them (i.e. maps uv.lock, a python thing, into flake.lock, a nix thing). I've been dabbling with typescript lately, so I'm using pnpm2nix to map typescript libraries in a similar way.
The learning curve is no joke, but if you climb it, only the hard problems will remain (deciding if the package is malicious in the first place).
Also, you'll have a new problem. You'll be forever cursed to watch people shoot themselves in the foot with inferior packaging, you'll know how to help them, but they'll turn you down with a variant of "that looks too unfamiliar, I'm going to stick with this thing that isn't working".
I think you missed the mark a bit here. This wasn’t a dependency that was compromised, it was a dep that was malicious from the start. Package manager doesn’t really play into this. Even if this package was vendored the outcome would have been the same.
No, the package manager actually DOES play into this. Or rather, the best practices it enforces do. I would be seriously surprised if Debian shipped malware, because the package manager is configured with Debian repos by default and you know you can trust these to have very strict oversight.
If apt's DNA were to download package binaries straight from GitHub, then I would blame it on the package manager for making it so inherently easy to download malware, wouldn't I?
Yeah, I mean, this is a tough problem. Unless you work for a government contractor with strict security policies, most devs are just going to run npm install without a second thought, as there are a lot of packages.
I don't know what the solution here is, other than stop using npm.
> I don't know what the solution here is, other than stop using npm
Personally I think we need to start adding capability based systems into our programming languages. Random code shouldn't have "ambient authority" to just do anything on my computer with the same privileges as me. Like, if a function has this signature:
function add(a: int, b: int) -> int
Then it should only be able to read its input, and return any integer it wants. But it shouldn't get ambient authority to access anything else on my computer. No network access. No filesystem. Nothing.
Philosophically, I kind of think of it like function arguments and globals. If I call a function foo(someobj), then function foo is explicitly given access to someobj. And it also has access to any globals in my program. But we generally consider globals to be smelly. Passing data explicitly is better.
But the whole filesystem is essentially available as a global that any function, anywhere, can access. With full user permissions. I say no. I want languages where the filesystem itself (or a subset of it) can be passed as an argument. And if a function doesn't get passed a filesystem, it can't access a filesystem. If a function isn't passed a network socket, it can't just create one out of nothing.
I don't think it would be that onerous. The main function would get passed "the whole operating system" in a sense - like the filesystem and so on. And then it can pass files and sockets and whatnot to functions that need access to that stuff.
If we build something like that, we should be able to build something like npm but where you don't need to trust the developers of 3rd party software so much. The current system of trusting everyone with everything is insane.
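As a rough TypeScript sketch of the idea (the capability types are invented for illustration, and note that today's JS can't actually enforce this, since any module can still import node:fs on its own; a real capability system would have to forbid such ambient imports at the language/module level):

    // Hypothetical capability types; invented for illustration.
    interface FileReadCap {
      read(path: string): Promise<string>;
    }
    interface NetCap {
      fetch(url: string): Promise<string>;
    }

    // Pure function: gets no capabilities, so it can't touch disk or network.
    function add(a: number, b: number): number {
      return a + b;
    }

    // Can only read through the capability it was handed; it was not
    // passed a NetCap, so it has no way to open a connection.
    async function loadConfig(fs: FileReadCap): Promise<string> {
      return fs.read("./config.json");
    }

    // Only main() is granted "the whole OS", and it decides how much
    // authority to delegate to each callee.
    async function main(os: { fs: FileReadCap; net: NetCap }) {
      const config = await loadConfig(os.fs); // no network authority passed down
      console.log(add(2, 3), config.length);
    }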
I couldn't agree with you more. The thing is, our underlying security models are protecting systems from their users, but do nothing for protecting user data from the programs they run. A capability-based security model would fix that.
Only on desktop. Mobile has this sorted: programs have unrestricted access to their own files, and can access the shared file space only through the user specifically selecting files.
I think there are two kinds of systems we're talking about here:
1. Capabilities given to a program by the user. Eg, "This program wants to access your contacts. Allow / deny". But everything within a program might still have undifferentiated access. This requires support from the operating system to restrict what a program can do. This exists today in iOS and Android.
2. Capabilities within a program. So, if I call a function in a 3rd party library with the signature add(int, int), it can't access the filesystem or open network connections or access any data thats not in its argument list. Enforcing this would require support from the programming language, not the operating system. I don't know of any programming languages today which do this. C and Rust both fail here, as any function in the program can access the memory space of the entire program and make arbitrary syscalls.
Application level permissions are a good start. But we need the second kind of fine-grained capabilities to protect us from malicious packages in npm, pip and cargo.
I would also say there is a 3rd class: distributed capabilities.
When you look at a mobile program such as GadgetBridge, which synchronizes data between a mobile device and a watch, the number of permissions it requires (contacts, bluetooth pairing, notifications, yadda yadda) goes on and on.
Systems like E-Lang wouldn't bundle all these up into a single application. Your watch would have some capabilities, and those would interact directly with capabilities on the phone. I feel like if you want to look at our current popular mobile OSes as capability systems, the capabilities are pretty coarse-grained.
One thing I would add about compilers, npm, pip, cargo: compilers are transformational programs; they really only need read and write access to a finite set of inputs and outputs. In that sense, even capabilities are overkill, because honestly they only need the bare minimum of IO; a batch processing system could do better than our mainstream OS security model.
Ironically, any C++ app I've written on Windows does exactly this. "Are you sure you want to allow this program to access networking?" At least the first time I run it.
Yeah, but if that app was built using a malicious dependency that only relied on the same permissions the app already uses, you’d just click “Yes” and move on and be pwned.
The issue with npm is that JS doesn't have a stdlib, so developers need to rely on npm and third-party libs even for things the stdlib provides in languages like Java, Python, Go, ...
And if you use Bun or Node.js, you also have out-of-the-box access to an HTTP server, filesystem APIs, gzip, TLS and more. And if you're working in a browser, almost everything in jQuery has since been pulled into the browser too. Eg, document.querySelector.
Of course, web frameworks like react aren't part of the standard library in JS. Nor should they be.
What more do you want JS to include by default? What do java, python and go have in their standard libraries that JS is missing?
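For what it's worth, a lot of the everyday basics really do ship with the runtime now. For example, in Node 18+ (run as an ES module), none of this needs a single npm package:

    // All built into Node 18+; no npm installs required.
    import { createServer } from "node:http";    // HTTP server
    import { gzipSync } from "node:zlib";        // compression
    import { readFile } from "node:fs/promises"; // filesystem

    createServer((req, res) => res.end("ok")).listen(8080);
    const page = await fetch("https://example.com"); // built-in HTTP client
    console.log(page.status, gzipSync(await readFile("./a.txt")).length);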
> When people say "js doesn't have a stdlib" they mean "js doesn't have a robust general purpose stdlib like C++ ...
It does though! The JS stdlib even includes an entire wasm runtime. It's huge!
Seriously. I can barely think of any features in the C++ stdlib that are missing from JS. There's a couple - like JS is missing std::priority_queue. But JS has soooo much stuff that C++ is missing. It's insane.
It's more work and more restrictive, I suppose. Any business is free to set up JFrog Artifactory and only allow the installation of approved dependencies. And anyone can pull Iron Bank images, I believe.
Yes, but it requires people. Typically, you identify a package you want (or a new version of a package you want) and you send off a request to a separate security team. They analyze and approve, and the package becomes available in your internal package manager. But this means 1) you need that team of people to do that work, and 2) there's a lot of hurry-up-and-wait involved.
Every docker image specified in a k8s yml or docker-compose file or GitHub action that isn't pinned with `@sha256:<hash>` (i.e. that specifies only a tag/label) is one "docker push" away from a compromise, given that tags/labels are not cryptographically bound to content. You're just trusting DockerHub and the publisher (or anyone with their creds) to not rug you.
The industry runs on a lot more unexamined trust than people think.
They’re deployed automatically by machine, which definitionally can’t even give it a second thought. The upstream trust is literally specified in code, to be reused constantly automatically. You could get owned in your sleep without doing anything just because a publisher got phished one day.
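To make the fix concrete, here is what the difference looks like in a compose-style excerpt (the digest below is a truncated placeholder, not a real one):

    services:
      app:
        # Mutable: whoever controls the tag can re-push different content.
        # image: node:20-alpine
        # Immutable: resolved by content hash; a later push can't change it.
        image: node@sha256:1f2d...   # placeholder digest; pin the real one

You can read the digests of images you already have with `docker images --digests` and pin those.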
Requiring the use of lockfiles, and strict adherence to checking updates, also helps. I tend to use dependencies for many things, but ones I've trusted over a long time; I know how they work, often chosen because of how they were implemented, so I can see the updates and review them myself. Scaling up to a team, you make that part of the process whenever you add a new dependency, and someone's name always has to be "assigned" to a dependency, so people take ownership of the code that gets added. Often people figure out it's not worth it, and find a simpler way.
I have to trust the publisher, otherwise I can't update, and I have to update because CVEs exist. If we step back: how do I even know that the image blessed with a hardcoded hash (double-checked against the website of whoever is supposed to publish it) isn't backdoored now?
Because it has been out and published and used for weeks/months. The longer an artifact is public and in use, the less chance it has of being malicious.
I've worked in plenty of JavaScript shops and unfortunately it's not so far off the mark. It's quite common to see JS projects with thousands of transitive dependencies. I've seen the same in Python too.
It's funny how Python has less of this reputation just because the package manager is so broken that you might have a hard time adding so many deps in the first place. (Maybe fixed with uv, but that's relatively new and not the default.)
If one relies on the JS ecosystem to put food on the table and can't realistically make changes at their job to mitigate this, short of developing on a second airgapped work-only computer what can developers do to at least partially mitigate the risk? I've heard others mention doing all development in docker containers. Perhaps using a Linux VM?
I was responsible for dev-ops, CI, and workstation security at my previous position.
Containerize all of your dev environments and lock dependency files to only resolve to a specific version of a dependency that is known safe.
Never do global installs directly, ideally don't even install node outside of a container.
Lag dependency updates by a couple weeks, and enable automated security scans like dependabot on GH. Do not allow automated updates, and verify every dependency prior to updating.
If you work on anything remotely sensitive, especially crypto adjacent, expect to be a target and use a dedicated workstation that you wipe regularly.
Sounds tedious, but that's the job.
Alternatively you could find a job outside the JS ecosystem, you'll likely get a pay bump too.
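A couple of the cheapest pieces of that lockdown are plain npm settings. This .npmrc is a minimal starting point; it reduces risk rather than eliminating it, and ignore-scripts will break the few packages that genuinely need a build step:

    # .npmrc
    ignore-scripts=true   # don't run dependencies' install/postinstall scripts
    save-exact=true       # record exact versions instead of ^ranges

In CI, prefer `npm ci` over `npm install`: it installs exactly what the lockfile says and fails if the manifest and lockfile disagree.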
I run Incus OS, which is an operating system made for spinning up containers and VMs. Whenever I have to work on a JS project I launch a new container for development and then ssh into it from my laptop. You can also run incus on your computer without installing it as an operating system.
Containers still have some risk since they share the host kernel, but they're a pretty good choice for protection against the types of attacks we see in the JS ecosystem. I'll switch to VMs when we start seeing container-escape exploits being published as npm packages :)
When I first started doing development this way it felt like I was being a bit too paranoid, but honestly it's so fast and easy it's not at all noticeable. I often have to work on projects that use outdated package managers and have hundreds of top-level dependencies, so it's worth the setup in my opinion.
Amazing suggestion. So you're running it inside a Docker container or something? I'm going to try this out. I guess the alternative is a VPS if all else fails.
But none of those would have helped in this case, where each dev/user intentionally installed the package specifically so it could retrieve data from the WhatsApp API.
What would have helped is if the dev/user had the ability to confirm before the code connected to a new domain or IP - api.whatsapp.com? Approve. JoesServer.com or a random IP? Block. Such functionality could live at the OS or Docker level, etc.
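You can approximate a crude version of this at the process level today. A sketch (a tripwire, not a boundary: a malicious dependency can still use raw sockets or child processes, so the real version belongs in the OS or the container's egress rules; the allowlist contents are just an example):

    // Wrap the global fetch with a host allowlist and block anything new.
    const allowedHosts = new Set(["api.whatsapp.com"]); // example allowlist
    const realFetch = globalThis.fetch;

    globalThis.fetch = ((input: RequestInfo | URL, init?: RequestInit) => {
      const url = new URL(input instanceof Request ? input.url : String(input));
      if (!allowedHosts.has(url.hostname)) {
        return Promise.reject(new Error(`blocked egress to ${url.hostname}`));
      }
      return realFetch(input, init);
    }) as typeof fetch;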
I'm aware thanks, but if your company is doing the standard practice of using 10k dependencies for some JS webslop you don't really have any other options but to protect yourself.
isolated-vm (https://www.npmjs.com/package/isolated-vm) here we come for increased sandboxing of node bits and pieces? And we are a year after Java took out the security manager that could sandbox jars in separate classloaders - a standout feature since 1995.
If they really believe their AI is that good and security practices and tooling that solid, why can't they automatically flag this stuff? I am sure they can, but once flagged a human has to check and that seems costly?
> The lotusbail npm package presents itself as a WhatsApp Web API library - a fork of the legitimate @whiskeysockets/baileys package.
> The package has been available on npm for 6 months and is still live at the time of writing.
> (...) malware that steals your WhatsApp credentials, intercepts every message, harvests your contacts, installs a persistent backdoor, and encrypts everything before sending it to the threat actor's server.
That package wasn't any more random than any other NodeJS package. NPM isn't inherently different from, say, Debian repositories, except the latter have oversight and stewardship and scrutiny.
That's what's needed and I am seriously surprised NPM is trusted like it is. And I am seriously surprised developers aren't afraid of being sued for shipping malware to people.
It's really, really not. Just write the libraries yourself. Have a team or two who does that stuff.
And, if you do need a lib because it's too much work, like maybe you have to parse some obscure language, just vendor the package. Read it, test it, make sure it works, and then pin the version. Realistically, you should only have a few dozens packages like this.
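And when you do keep a dependency, pin it exactly in package.json rather than accepting a semver range; "2.4.1" instead of "^2.4.1" means an upstream publish can't silently change what you build (the package name below is made up):

    {
      "dependencies": {
        "some-parser": "2.4.1"
      }
    }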
This does help. Even before, I was pretty careful about what I used, not just for security but also simplicity. Nowadays it's even easier to LLM-generate utils that one might've installed a dep for in the past.
Well, he didn’t say vibe code. Presumably, you’d still be reviewing the AI code before committing it.
I ran a little experiment recently, and it does take longer than just pulling in npm dependencies, but not that much longer for my particular project: logging, routing, rpc layer with end-to-end static types, database migrations, and so on. It took me a week to build a realistic, albeit simple app with only a few dependencies (Preact and Zod) running on Bun.
Yes, and even more so now that we are vibe coding codebases with piles of random deps that nobody even bothers to look at.
You can mitigate it by fully containerizing your dev env, locking your deps, enabling security scans, and manually updating your deps on a lagging schedule.
Never use npm global deps, pretty much the worst thing you can do in this situation.
Are many of the packages obfuscated? It seems like here the server URL was heavily obfuscated and encrypted; that is a big warning flag, is it not? Auto-scanning a submitted package and flagging obfuscated / binary payloads / install scripts for further inspection could help. I am wondering how such packages get automatically promoted for distribution...
If you have to run it regardless, contain it as well as you can, given the potential impact. If you're not using the same machine for anything else, maybe "good riddance" is the way to go? Otherwise try to sandbox it, understanding the tradeoffs and (still) the risks. Easiest for now is to just run everything in rootless podman containers (or similar), which is relatively easy. Otherwise VMs, or other machines. It all depends on what effort you feel is worth it, so really on what it is you are protecting.
I had a dependency of a dependency installing crypto miners: it was pretty scary, as we have not had this since WordPress. I saw a lot more people having this issue ("there is a weird process consuming all my CPU"). Like someone here already said: we need an Apache/NPM Commons, and when packages use anything outside those, big fat alarm bells should chime.
As others pointed out elsewhere, that wouldn’t have helped in this case as presumably it wouldn’t include a WhatsApp API, the purpose of this package. But it could help in general, sure.
NPM shows 0 dependents in public packages. The 56k downloads number can easily have been gamed by automation and is therefore not a reliable signal of popularity.
As of this writing, the alleged malware/project is still available on npm and GitHub. I'm surprised koi.ai does not mention in their article whether they have reported their findings to npm/GitHub.
Recently I audited a software plan created by an AI tool. NPM dependencies as far as the eye can see. I can only imagine the daunting liability concerns if the suggested "engineering" style were actually put forth to be used in production across a wide userbase. That said, the process of the user creating the "draft" codebase gave them a better understanding of the scope of work necessary.
I am seriously surprised developers trust NodeJS to this extent and aren't afraid of being sued for inadvertently shipping malware to people.
It's got to be a matter of time, doesn't it, before some software company gets in serious trouble because of that. Or before NPM actually puts some serious stewardship process in place.
This has nothing to do with NodeJS or NPM. The code is freely distributed, just like any open source repo or package manager may provide. The onus is on those who use it to audit what it actually does.
It absolutely does have to do with it. If we continued to ship software libraries like we still do on Linux, then you wouldn't be downloading its releases straight from the source repo, but rather have someone package and maintain them.
Except at the granularity of NodeJS packages, it would be nearly impossible to do.
Malicious libraries will drive more code to be written by LLMs. Currently, malicious libraries seem to be typically trivial libraries. A WhatsApp API library is just on the edge of something that can be vibe coded, and avoiding getting pwned may be a good enough tipping point to embrace NIH syndrome more and more, which I think would be a net negative for F/OSS
The incentives are aligned with the AI models companies, which benefit from using more tokens to code something from scratch
Security issues will simply move to LLM related security holes
Verify what? I certainly don't have the capacity to thoroughly review my every dependency's source code in order to detect potentially hidden malware.
In this case, more realistic advice would probably be to either rely on a more popular package to benefit from swarm intelligence, or to create your own implementation.
Also, scrutinize every dependency you introduce. I have seen sooooo many dependencies over the years where a library was brought in for one or two things which you could write yourself in 5 minutes (e.g. commons-lang just for null-safe string compare or contains).
Sure but you basically need a different ecosystem to bring in a popular package and not expect to end up with these trivial libraries indirectly through some of the dependencies.
Said scrutinizing from my side consists of checking the number of downloads and age of the package, maybe at best a quick look at the GitHub.
Yes, I'm sure many dependencies aren't very necessary. However, in many projects I worked on (corporate) which were on the older Webpack/Babel/Jest stack, you can expect node_modules at over 1 GB. There this ship has sailed long ago.
But on the upside, most of those packages should be fairly popular. With pnpm's dependency cooldown and whitelisting of postinstall scripts, you are probably good.
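For reference, both of those pnpm features are just a bit of config. The key names below are from recent pnpm versions, so treat them as an approximation and check the pnpm docs for yours (esbuild is just an example of a package that legitimately needs its build script):

    # pnpm-workspace.yaml
    minimumReleaseAge: 10080   # cooldown: skip versions younger than 7 days (minutes)
    onlyBuiltDependencies:     # allowlist of packages permitted to run scripts
      - esbuild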
It also seems weird that people are only scanning code that breaks?
I have 0 cred in anything security, so maybe i'm just missing a bigger picture thing, but like...if you told me i had to make some sort of malicious NPM package and get people to use it, i'd probably just find something that works, copy the code, put in some stylistic changes, and then bury my malicious code in there?
This seems so obvious that I question if the OP is correct in stating people aren't looking for that, or maybe I misunderstand what they mean because i'm ignorant?
Wonder if this is possible with Flutter packages or Python? I'm looking to slowly get away from the JavaScript ecosystem.
I've started using Flutter even for web applications; it works pretty well. I still use Astro/React for frontend websites though, so I can't completely get away from it.
The code is literally right there for you. It doesn't matter what ecosystem or package manager. Someone could distribute the same thing anywhere — it's up to those pulling it in to actually start auditing what they're accepting.
JavaScript fanatics will downvote me, but I will say it again.
JavaScript is meant to be run in an untrusted environment (think browser), and running it in any form of trusted environment increases the risk drastically [1]
The language is too hard to do meaningful static analysis on.
This particular attack is much harder (though not impossible) to execute in Java, Go, or Rust-based packages.
In what way is it harder to write a library that exfiltrates credentials passed to it in those languages? I’d think it’d be a bit easier because you could use the standard library instead of custom encryption, but otherwise pretty much the same.
> In what way is it harder to write a library that exfiltrates credentials passed to it in those languages?
It is not harder to write.
It is more challenging to execute this attack stealthily.
Due to the myriad behaviors of runtimes (browser vs. backend), frameworks (and their numerous versions), and over-dependency on external dependencies (e.g., leftpad), the risk in JS-based backends increases significantly.
Once again: just having a better supply-chain tool and reviewing changed packages could mitigate this. Maybe holding back some of the dependencies of dependencies would help too.
Why aren't more teams putting some tool in front of their blind installs from NPM (et al.)?
Soon the only way to assure your readers that your writing is human is by calling them a motherfucker in the opening sentence.
But then, you'd only be sure that the first sentence was legitimate and not the rest of the article. That is why I constantly reassure my readers that they're some goddamn motherfuckers throughout my writing. And you, too, are one, my friend.
Just to talk about a different direction here for a second:
Something that I find to be a frustrating side effect of malware issues like this is that it seems to result in well-intentioned security teams locking down the data in apps.
The justification is quite plausible -- in this case WhatsApp messages were being stolen! But the thing is... that if this isn't what they steal they'll steal something else.
Meanwhile locking down those apps so the only apps with a certain signature can read from your WhatsApp means that if you want to back up your messages or read them for any legitimate purpose you're now SOL, or reliant on a usually slow, non-automatable UI-only flow.
I'm glad that modern computers are more secure than they have been, but I think that defense in depth by locking down everything and creating more silos is a problem of its own.
I agree with this, just to note for context though: This (or rather the package that was forked) is not a wrapper of any official WhatsApp API or anything like that, it poses as a WhatsApp client (WhatsApp Web), which the author reverse engineered the protocol of.
So users go through the same steps as if they were connecting another client to their WhatsApp account, and the client gets full access to all data of course.
From what I understand WhatsApp is already fairly locked down, so people had to resort to this sort of thing – if WA had actually offered this data via a proper API with granular permissions, there might have been a lower chance of this happening.
See: https://baileys.wiki/docs/intro/
The OS should be mediating such access where it explicitly asks your permission for an app to access data belonging to another publisher.
I could certainly see the value in this in principle but sadly the labyrinthine mess that is the Apple permission system (in which they learned none of the lessons of early UAC) illustrates the kind of result that seems to arise from this.
A great microcosm illustration of this is automation permission on macOS right now: there's a separate allow dialog for every single app. If you try to use a general purpose automation app it needs to request permission for every single app on your computer individually the first time you use it. Having experienced that in practice it... absolutely sucks.
At this point it makes me feel like we need something like an async audit API. Maybe the OS just tracks and logs all of your apps' activity and then:
1) You can view it of course.
2) The OS monitors for deviations from expected patterns for that app globally (kinda like Microsoft's SmartScreen?)
3) Your own apps can get permission to read this audit log if you want to analyze it your own way and/or be more secure. If you're more paranoid maybe you could use a variant that kills an app in a hurry if it's misbehaving.
Sadly you can't even implement this as a third party thing on macOS at this point because the security model prohibits you from monitoring other apps. You can't even do it with the user's permission because tracing apps requires you to turn SIP off.
> Maybe the OS just tracks and logs all of your apps' activity
The problem here, is that like so many social-media apps, the first thing the app will do is scrape as much as it possibly can from the device, lest it lose access later, at which point auditing it and restricting its permissions is already too late.
Give an inch, and they’ll take a mile. Better to make them justify every millimetre instead.
This just sounds like another security nightmare.
We're not in 1980 anymore. Most people need zero, and even power users need at most one or two apps that need that full access to the disk.
In macOS, for example, the sandbox and the file dialog already allow opening any file, bundle or folder on the disk. I haven't really come across any app that does better browsing than this dialog, but if there's any, it should be a special case. Funny enough, WhatsApp on iOS is an app that reimplements the photo browser, as a dark pattern to force users to either give full permission to photos or suffer.
The only time where the OS file dialog becomes limited is when a file is actually "multiple files". Which is 1) solvable by bundles or folders and 2) a symptom of developers not giving a shit about usability.
This sounds great on paper, but what happens when the OS isn't working for the user like Windows?
Switch OS.
I mean this was an app for accessing WhatsApp data, you would approve it and go on... the problem is with it sending data off to a 3rd party.
I think you miss understood. If the OS becomes the arbiter of what can and cannot be accessed; it's a slippery slope to the OS becoming a walled garden that only approved apps and developers are allowed to operate. Of course that is a pretty large generalization, but we already see it with mobile devices and are starting to see it with windows and Mac OS.
I don't think we should be handing more power to OS makers and away from users. There has to be a middle ground between wall gardens and open systems. It would be much better for node & npm to come up with a solution than locking down access.
The arbiter of what can be accessed should be the user, and always the user. The OS should be merely the enforcer.
Currently OSs are a free-for-all, where the user must blindly trust third-party apps, or they enforce it clumsily like in macOS.
This was fine in 1980 but isn't anymore.
Windows is dead
MacOS does this. It has a popup to grant access to folders like documents.
I'm pretty sure WhatsApp does this for anti-competitive reasons not security reasons.
xkcd covers this really well: https://xkcd.com/2044/
Meanwhile locking down those apps so the only apps with a certain signature can read from your WhatsApp means that if you want to back up your messages or read them for any legitimate purpose you're now SOL, or reliant on a usually slow, non-automatable UI-only flow.
...and this gives them more control, so they can profit from it. Corporate greed knows no bounds.
I'm glad that modern computers are more secure than they have been
I'm not. Back when malware was more prevalent among the lower class, there was also far more freedom and interoperability.
> Back when malware was more prevalent among the lower class, there was also far more freedom and interoperability.
Yeah, “the lower class” had the freedom of having their IM accounts hacked and blast spam/fraud messages to all contacts all the time. How nostalgic.
I imagine the average HN commenter seeing every new story being posted and thinking "how could I criticise big tech using this"
I don't really know what I'm doing, but. Why couldn't messages be stored encrypted on a blockchain with a system where both user's in a one-one conversation agree to a key, or have their own keys, that grants permission for 'their' messages. And then you'd never be locked into a private software / private database / private protocol. You could read your messages at any point with your key.
I hope we all have the same change in sentiment about AI in 3 years from now!
At this point, the existence of these attacks should be an expected outcome. (It should have been expected even without the empirical record we now have and the multiple times that we can now cite.)
NPM and NPM-style package managers that are designed to late-fetch dependencies just before build-time are already fundamentally broken. They're an end-run around the underlying version control system, all in favor of an ill-considered, half-baked scheme to implement an alternative approach to version control of the package manager project maintainers' devising.
And they provide cover for attacks like this, because they encourage a culture where, because one's dependencies are all "over there", the massive surface area gets swept under the rug and they never get reviewed (because 56K NPM users can't be wrong).
I am slowly waking up to the realization that we (software engineers) are laughably bad at security. I used to think that it was only NPM (I have worked a lot in this ecosystem over the years), but I have found this to be essentially everywhere: NPM is a poster child for this because of executable scripts on install, but every package manager essentially boils down to "Install this thing by name, no security checks". Every ecosystem I touch now (apart from gamedev, but only because I roll everything myself there by choice) has this - e.g Cargo has a lot of "tools" that you install globally so that you get some capability (like flamegraphs, asm output, test runners etc.) - this is the same vulnerability, manifesting slightly differently. Like others have pointed out, it is common to just pull random Docker images via Helm charts. It is also common to get random "utility" tools during builds in CI/CD pipelines, just by curl-ing random URLs of various "release archives". You don't even have to look too hard - this is surface level in pretty much every company, almost every industry (I have my doubts about the security theatre in some, but I have no first hand experience, so cannot say)
The issue I have is that I don't really have a good idea for a solution to this problem - on one hand, I don't expect everyone to roll the entire modern stacks by hand every time. Killing collaborative software development seems like literally throwing the baby out with the bath water. On the other hand, I feel like nothing I touch is "secure" in any real sense - the tick boxes are there, and they are all checked, but I don't think a single one of them really protects me against anything - most of the time, the monster is already inside the house.
>The issue I have is that I don't really have a good idea for a solution to this problem - on one hand, I don't expect everyone to roll the entire modern stacks by hand every time. Killing collaborative software development seems like literally throwing the baby out with the bath water.
Is NPM really collaborative? People just throw stuff out there and you can pick it up. It's the least commons denominator of collaboration.
The thing that NPM is missing is trust and trust doesn't scale to 1000x dependencies.
I think the solution is a build system that requires version pinning - options include Nix, Bazel, and Buck.
I am a big fan of Bazel and have explored Nix (although, regrettably not used it in anger quite yet) - both seem like good steps in the right direction and something I would love to see more usage/evolution of. However, it is important to recognize that these tools have a steep learning curve and require deep knowledge in more than one aspect in order to be used effectively/at all.
Speed of development and development experience are not metrics to be minimized/discarded lightly. If you were to start a company/product/project tomorrow, a lot of the things you want to be doing in the beginning are not related to these tools. You probably, most of the time, want to be exploring your solution space. Creating a development and CI/CD environment that can fully take advantage of these tools capabilities (like hermeticity and reproducibility) is not straightforward - in most cases setting up, scaling and maintaining these often requires a whole team with knowledge that most developers won't have. You don't want to gatekeep the writing of new software behind such requirements. But I do agree that the default should be closer to this, than what we have today. How we get there - now that is the million dollar question.
Back in the days of Makefilea and autoconf, we tended to require specific versions and would document that in the readme.
Something that I keep thinking about is spec driven design.
If, for code, there is a parallel "state" document with the intent behind each line of code, each function
And in conjunction that state document is connected to a "higher layer of abstraction" document (recursively up as needed) to tie in higher layers of intent
Such a thing would make it easier to surface weird behavior imo, alongside general "spec driven design" perks. More human readable = more eyes, and potential for automated LLM analysis too.
I'm not sure it'd be _Perfect_, but I think it'd be loads better than what we've got now
IMO the solution is auditing. We should be auditing every single version of every single dependency before we use it. Not necessarily personally, but we could have a review system like Ebay/Uber/AirBnB and require N trusted reviews.
This is the way. But people read it, nod their heads, and then go back to yolo'ing dependencies into their project without reading them. Culture change is needed.
I agree with much of what you said here, but is it really just about the package manager? If I had specified this repo's git url with a specific version number or sha directly in my package.json, the outcome would be just about the same. And so that's not really an end-run around version control at that point. Even with npm out of the picture the problem is still there.
> If I had specified this repo's git url with a specific version number or sha directly in my package.json[…] that's not really an end-run around version control at that point
Yes it is. Git doesn't operate based on package.json.
You're still trying to devise a scheme where, instead of Git tracking the source code of what you're building and deploying and/or turning into a release, you're excluding parts of that content from Git's purview. That's doing an end-run around the VCS.
The root problem is the OS allows npm packages to grab your WhatsApp messages without the user knowing.
This is an npm package that allows you to interact with WhatsApp using their API. The OS wouldn’t prevent this as it’s not interacting with your WhatsApp on your machine, but rather logging you in via a skillfully made 3rd party interface, that unfortunately happens to also be evil.
There are so many package managers out there for different platforms. I feel like there should be some more general, standardized, package manager that is language agnostic. Something that: - has some guarantees about dependencies - has some guarantees about provenance (only allow if signed by x, y, z kind of thing) - has a standardized api so corporate or third party curation of packages is possible (I want my own company package manager that I curate) - does ????
I don't know, it just seems like every tech area has these problems and I honestly don't understand why there aren't more 'standardized' solutions here
> They're an end-run around the underlying version control system
I assume by "underlying version control system" you mean apt, rpm, homebrew and friends? They don't solve this problem either. Nobody in the opensource world is auditing code for you. Compromised xz still made it into apt. Who knows how many other packages are compromised in a similar way?
Also, apt and friends don't solve the problem that npm, cargo, pip and so on solve. I'm writing some software. I want to depend on some package X at version Y (eg numpy, serde, react, whatever). I want to use that package, at that version, on all supported platforms. Debian. Ubuntu. Redhat. MacOS. And so on. Try and do that using the system package manager and you're in a world of hurt. "Oh, your system only has official packages for SDL2, not SDL3. Maybe move your entire computer to an unustable branch of ubuntu to fix it?" / "Yeah, we don't have that python package in homebrew. Maybe you could add it and maintain it yourself?" / "New ticket: I'm trying to run your software in gentoo, but it only has an earlier version of dependency Y."
Hell. Utter hell.
No, other trusted repositories are legitimately better because the maintainers built the software themselves. They don't purely rely on binaries from the original developer.
It's not perfect and bad things still make it through, but just look at your example - XZ. This never made it into Debian stable repositories and it was caught remarkably quickly. Meanwhile, we have NPM vulnerability after vulnerability.
But do they audit the code? I say mostly no. They grab the source, try to compile it. Develop patches to fix problems on the specific platform. Once it works, passes the tests, it's done. Package created, added to the repo.
Even OpenBSD, famous for auditing their code, doesn't audit packages. Only the base system.
> I assume by "underlying version control system" you mean apt, rpm, homebrew and friends
No. Git.
...unless your system package manager is nix.
What is so special about nix that it avoids all these issues?
nix is designed to support many versions of your dependencies on the same system by building a hash of your dependency graph and using that as a kind of dependency namespace for the various applications you have installed. The result is that you can run many versions of whatever application you want on the same system.
Unless someone is vetting code, nothing.
> Nobody in the opensource world is auditing code for you
That's still true of nix. Whether you should trust a package is on you. But nix solves everything else listed here.
> I want to use that package, at that version, on all supported platforms...
Nix derivations will fail to build if their contents rely on the FHS (https://refspecs.linuxfoundation.org/FHS_3.0/fhs/index.html), so if a package tries to blindly trust that `/bin/bash` is in fact a compatible version of what you think it is, it won't make it into the package set. So we can each package our own bash script, and instead of running on "bash" each will run on the precise version of bash that we packaged with it. This goes for everything, though: compilers, linkers, interpreters, packages that you might otherwise have installed with pip or npm or cargo... nix demands a hash for it up front. It could still have been malicious the whole time, but it can't suddenly become malicious at a later date.
> ... Debian. Ubuntu. Redhat. MacOS. And so on. Try and do that using the system package manager and you're in a world of hurt.
If you're on NixOS, nix is your system package manager. If you're not, you can still install nix and use it on all of those platforms (not Windows, certain heroic folk are working on that, WSL works though)
> Oh, your system only has official packages for SDL2, not SDL3. Maybe move your entire computer to an unstable branch of ubuntu to fix it?"
I just installed SDL3, nix put it in `/nix/store/yla09kr0357x5khlm8ijkmfm8vvzzkxb-sdl3-3.2.26`. Then I installed SDL2, nix put it in `/nix/store/a5ybsxyliwbay8lxx4994xinr2jw079z-sdl2-compat-2.32.58`. If I want one or the other at different times, nix will add or remove those from my path. I just have to tell nix which one I want...
> "Yeah, we don't have that python package in homebrew. Maybe you could add it and maintain it yourself?"All of the major languages have some kind of foo2nix adapter package. When I want to use a python package that's not in nixpkgs, I use uv2nix and nix handles enforcing package sanity on them (i.e. maps uv.lock, a python thing, into flake.lock, a nix thing). I've been dabbling with typescript lately, so I'm using pnpm2nix to map typescript libraries in a similar way.
The learning curve is no joke, but if you climb it, only the hard problems will remain (deciding if the package is malicious in the first place).
Also, you'll have a new problem. You'll be forever cursed to watch people shoot themselves in the foot with inferior packaging, you'll know how to help them, but they'll turn you down with a variant of "that looks too unfamiliar, I'm going to stick with this thing that isn't working".
I think you missed the mark a bit here. This wasn’t a dependency that was compromised, it was a dep that was malicious from the start. Package manager doesn’t really play into this. Even if this package was vendored the outcome would have been the same.
No, package manager actually DOES play into this. Or rather, the best practices it enforces do. I would be seriously surprised if debian shipped malware, because the package manager is configured with debian repos by default and you know you can trust these to have very strict oversight.
If apt's DNA was to download package binaries straight from Github, then I would blame it on the package manager for making it so inherently easy to download malware, wouldn't I?
> I think you missed the mark a bit here. This wasn’t a dependency that was compromised, it was a dep that was malicious from the start.
You're making assumptions that I am making assumptions, but I wasn't making assumptions. I understand the attack.
> Package manager doesn’t really play into this.
It does, for the reasons I described.
> the kind of dependency developers install without a second thought
Kind of a terrifying statement, right there.
Yeah, I mean this is a tough problem. Unless you work for a government contractor where they have strict security policies, most devs are just going to run npm install without a second thought, as there are a lot of packages.
I don't know what the solution here is other than stop using npm.
> I don't know what the solution here is other than stop using npm
Personally I think we need to start adding capability based systems into our programming languages. Random code shouldn't have "ambient authority" to just do anything on my computer with the same privileges as me. Like, if a function has this signature:
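    // hypothetical signature, along the lines of the add(int, int)
    // example later in the thread:
    function parse(input: string): number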
Then it should only be able to read its input, and return any integer it wants. But it shouldn't get ambient authority to access anything else on my computer. No network access. No filesystem. Nothing.

Philosophically, I kind of think of it like function arguments and globals. If I call a function foo(someobj), then function foo is explicitly given access to someobj. And it also has access to any globals in my program. But we generally consider globals to be smelly. Passing data explicitly is better.
But the whole filesystem is essentially available as a global that any function, anywhere, can access. With full user permissions. I say no. I want languages where the filesystem itself (or a subset of it) can be passed as an argument. And if a function doesn't get passed a filesystem, it can't access a filesystem. If a function isn't passed a network socket, it can't just create one out of nothing.
I don't think it would be that onerous. The main function would get passed "the whole operating system" in a sense - like the filesystem and so on. And then it can pass files and sockets and whatnot to functions that need access to that stuff.
If we build something like that, we should be able to build something like npm but where you don't need to trust the developers of 3rd party software so much. The current system of trusting everyone with everything is insane.
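Here's a minimal sketch of what that could look like, with hypothetical capability interfaces (none of this is a real library):

    // Hypothetical capability type: the only authority this value grants
    // is reading files under one directory.
    interface ReadOnlyDir {
      read(path: string): Promise<string>;
    }

    // This function was handed a directory capability and nothing else.
    // It cannot open sockets or touch files outside `dir`.
    async function loadConfig(dir: ReadOnlyDir): Promise<unknown> {
      return JSON.parse(await dir.read("config.json"));
    }

    // Only main() starts with ambient authority, and it narrows that
    // authority before handing anything to third-party code.
    async function main(os: { openDir(path: string): ReadOnlyDir }) {
      const config = await loadConfig(os.openDir("/etc/myapp"));
      console.log(config);
    }

Deno's permission flags and WASI do something like this at the process boundary, but nothing mainstream enforces it per function, which is exactly the gap.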
I couldn't agree with you more. The thing is, our underlying security models protect systems from their users, but do nothing to protect user data from the programs they run. A capability-based security model would fix that.
Only on desktop. Mobile has this sorted: programs have unrestricted access to their own files, and can access the shared file space only through the user specifically selecting files.
I think there's 2 kinds of systems we're talking about here:
1. Capabilities given to a program by the user. Eg, "This program wants to access your contacts. Allow / deny". But everything within a program might still have undifferentiated access. This requires support from the operating system to restrict what a program can do. This exists today in iOS and Android.
2. Capabilities within a program. So, if I call a function in a 3rd party library with the signature add(int, int), it can't access the filesystem or open network connections or access any data thats not in its argument list. Enforcing this would require support from the programming language, not the operating system. I don't know of any programming languages today which do this. C and Rust both fail here, as any function in the program can access the memory space of the entire program and make arbitrary syscalls.
Application level permissions are a good start. But we need the second kind of fine-grained capabilities to protect us from malicious packages in npm, pip and cargo.
I would also say there is a 3rd class, which are distributed capabilities.
When you look at a mobile program such as GadgetBridge, which synchronizes data between a phone and a watch, the list of permissions it requires (contacts, bluetooth pairing, notifications, yadda yadda) goes on and on.
Systems like E-Lang wouldn't bundle all these up into a single application. Your watch would have some capabilities, and those would interact directly with capabilities on the phone. I feel like if you want to look at our current popular mobile OSes as capability systems, the capabilities are pretty coarse grained.
One thing I would add about compilers, npm, pip, cargo: compilers are transformational programs; they really only need read and write access to a finite set of inputs and outputs. In that sense, even capabilities are overkill, because honestly they need only the bare minimum of IO; a batch processing system could do better than our mainstream OS security model.
> No network access. No filesystem. Nothing.
Ironically, any c++ app I've written on windows does exactly this. "Are you sure you want to allow this program to access networking?" At least the first time I run it.
I also rarely write/run code for windows.
Yeah, but if that app was built using a malicious dependency that only relied on the same permissions the app already uses, you’d just click “Yes” and move on and be pwned.
Oh, I don't npm.
If I can't yum (et al.) install it, I absolutely review the past major point releases for an hour and do my research on the library.
Is there any guarantee that yum (et. al.) packages are audited?
The issue with npm is JS doesn't have a stdlib, so developers need to rely on npm and third party libs even for things stdlib provide in languages like Java, Python, Go, ...
Sure it does. The JS standard library these days is huge. It's way bigger than C's, Zig's and Rust's. It includes:
- Random numbers
- Timezones, date formatting
- JSON parsing & serialization
- Functional programming tools (map, filter, reduce, Object.fromEntries, etc)
- TypedArrays
And if you use bun or nodejs, you also have out of the box access to an HTTP server, filesystem APIs, gzip, TLS and more. And if you're working in a browser, almost everything in jquery has since been pulled into the browser too. Eg, document.querySelector.
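For instance, every line below runs on a modern browser or a recent Node with zero dependencies:

    const rand = crypto.getRandomValues(new Uint32Array(1))[0]; // random numbers
    const stamp = new Intl.DateTimeFormat("en", {
      timeZone: "Asia/Tokyo",
      dateStyle: "full",
      timeStyle: "short",
    }).format(new Date()); // timezone-aware date formatting
    const evens = [1, 2, 3, 4].filter((n) => n % 2 === 0); // FP tools
    const wire = JSON.stringify({ rand, stamp, evens }); // JSON serialization
    const bytes = new Uint8Array(16); // TypedArrays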
Of course, web frameworks like react aren't part of the standard library in JS. Nor should they be.
What more do you want JS to include by default? What do java, python and go have in their standard libraries that JS is missing?
When people say "js doesn't have a stdlib" they mean "js doesn't have a robust general purpose stdlib like C++ or ${LANGUAGE_ID_RATHER_BE_USING}."
But of course it fucking doesn't because it's a scripting language for the web. It has what it needs, and to do that it doesn't need much.
> When people say "js doesn't have a stdlib" they mean "js doesn't have a robust general purpose stdlib like C++ ...
It does though! The JS stdlib even includes an entire wasm runtime. It's huge!
Seriously. I can barely think of any features in the C++ stdlib that are missing from JS. There's a couple - like JS is missing std::priority_queue. But JS has soooo much stuff that C++ is missing. It's insane.
JS has a stdlib, so to speak. See nodejs and the Web standards.
And no programming language's stdlib includes, e.g., WhatsApp API libraries.
> unless you work for a government contractor where they have strict security policies
... So you're saying there is a blueprint for mitigating this already, and it just isn't followed?
It's more work and more restrictive, I suppose. Any business is free to set up JFrog Artifactory and only allow the installation of approved dependencies. And anyone can pull Iron Bank images, I believe.
Yes, but it requires people. Typically, you identify a package you want (or a new version of a package you want) and you send off a request to a separate security team. They analyze and approve, and the package becomes available in your internal package manager. But this means 1) you need that team of people to do that work, and 2) there's a lot of hurry-up-and-wait involved.
> Yes, but it requires people.
I've heard rumor of a few 100k people laid off in tech over the past few years that might be interested.
Who's gonna pay for it? The companies that laid off those people? They'll just continue on without worrying.
Every docker image specified in a k8s yml or docker-compose file or github action that doesn't end in @sha256:<hash> (i.e. one that specifies a mutable tag, like `nginx:latest`, instead of an immutable digest, like `nginx@sha256:<hash>`) is one "docker push" away from a compromise, given that tags are not cryptographically bound to content. You're just trusting DockerHub and the publisher (or anyone with their creds) to not rug you.
The industry runs on a lot more unexamined trust than people think.
They’re deployed automatically by machine, which definitionally can’t even give it a second thought. The upstream trust is literally specified in code, to be reused constantly automatically. You could get owned in your sleep without doing anything just because a publisher got phished one day.
That's one reason I barely use any dependencies. I'm forced to use a couple, but I tend to "roll my own," quite a bit.
Well, I should qualify that. I do use quite a few dependencies, but they are ones that I wrote.
Requiring the use of lockfiles and strict adherence to checking updates also helps. I tend to use dependencies for many things, but ones I've trusted over a long time: I know how they work, and they were often chosen because of how they were implemented, so I can see the updates and review them myself. Scaling up to a team, you make that part of the process whenever you add a new dependency, and someone's name always has to be "assigned" to a dependency, so people take ownership of the code that gets added. Often people figure out it's not worth it and find a simpler way.
That sounds like a great policy.
I have to trust the publisher, otherwise I can't update, and I have to update because CVEs exist. If we step back, how do I even know that the image blessed with a hardcoded hash (double-checked against the website of whoever is supposed to publish it) isn't backdoored now?
Because it has been out and published and used for weeks/months. The longer an artifact is public and in use, the less chance it has of being malicious.
Pinning a GitHub Actions action doesn't prevent the action itself from doing an apt install, npm install or running a Docker image that is not pinned.
It's terrifying because it's true for a majority of developers.
It's also hyperbole.
I've worked in plenty of javascript shops and unfortunately it's not so far off the mark. It's quite common to see JS projects with thousands of transitive dependencies. I've seen the same in python too.
It's funny how Py has less of this reputation just because the package manager is so broken that you might have a hard time adding so many deps in the first place. (Maybe fixed with uv, but that's relatively new and not default.)
Is there no Apache Commons for Javascript? It'd be nice to have a large library from a 'trusted' group.
https://en.wikipedia.org/wiki/Apache_Commons
No matter how large the library, it won't include WhatsApp's API
If one relies on the JS ecosystem to put food on the table and can't realistically make changes at their job to mitigate this, short of developing on a second airgapped work-only computer what can developers do to at least partially mitigate the risk? I've heard others mention doing all development in docker containers. Perhaps using a Linux VM?
I was responsible for dev-ops, ci, workstation security at my previous position.
Containerize all of your dev environments and lock dependency files to only resolve to a specific version of a dependency that is known safe.
Never do global installs directly, ideally don't even install node outside of a container.
Lag dependency updates by a couple weeks, and enable automated security scans like dependabot on GH. Do not allow automated updates, and verify every dependency prior to updating.
If you work on anything remotely sensitive, especially crypto adjacent, expect to be a target and use a dedicated workstation that you wipe regularly.
Sounds tedious, but that's the job.
Alternatively you could find a job outside the JS ecosystem, you'll likely get a pay bump too.
I run Incus OS, an operating system made for spinning up containers and VMs. Whenever I have to work on a JS project I launch a new container for development and then ssh into it from my laptop. You can also run incus on your computer without installing it as an operating system.
Containers still have some risk since they share the host kernel, but they're a pretty good choice for protection against the types of attacks we see in the JS ecosystem. I'll switch to VMs when we start seeing container escape exploits being published as npm packages :)
When I first started doing development this way it felt like I was being a bit too paranoid, but honestly it's so fast and easy it's not at all noticeable. I often have to work on projects that use outdated package managers and have hundreds of top-level dependencies, so it's worth the setup in my opinion.
I'm waiting for container escapes too, it's only a matter of time.
Haven't seen any in the wild, but I built a few PoCs just to prove to myself that I wasn't being overly paranoid.
Amazing suggestion. So you're running it inside a Docker container or something? I'm going to try this out. I guess the alternative is a VPS if all else fails.
I do the same thing. Podman all the things.
But none of those would have helped in this case, where each dev/user intentionally installed the package specifically so it could retrieve data from the WhatsApp API.
What would have helped is if the dev/user had the ability to confirm before the code connected to a new domain or IP: api.WhatsApp.com? Approve. JoesServer.com or a random IP? Block. Such functionality could live at the OS or Docker level, etc.
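As a rough in-process sketch of that idea for Node (the allowlist and the monkey-patching are purely illustrative; code this runs alongside could undo the hook, so it's not a real defense):

    import net from "node:net";

    // Hypothetical policy: the only host this process may dial out to.
    const allowedHosts = new Set(["api.whatsapp.com"]);

    const originalConnect = net.Socket.prototype.connect;
    (net.Socket.prototype as any).connect = function (...args: any[]) {
      // connect(options, ...) or connect(port, host, ...)
      const host =
        typeof args[0] === "object" ? args[0]?.host : args[1] ?? "localhost";
      if (host && !allowedHosts.has(host)) {
        throw new Error(`blocked outbound connection to ${host}`);
      }
      return (originalConnect as any).apply(this, args);
    };

Which is why, as you say, the real version of this belongs at the OS, firewall, or container layer rather than inside the process being policed.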
Some companies mandate that npm packages have to be x months old. Which gives time for this stuff to be discovered.
If you're distributing something that uses this package, it's not just your dev computer at risk, it's all the users.
I'm aware thanks, but if your company is doing the standard practice of using 10k dependencies for some JS webslop you don't really have any other options but to protect yourself.
isolated-vm (https://www.npmjs.com/package/isolated-vm) here we come for increased sandboxing of node bits and pieces? And we are a year after Java took out the security manager that could sandbox jars in separate classloaders - a standout feature since 1995.
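Going by the package's README, usage looks roughly like this (a sketch; treat the exact API as an assumption and check the docs):

    import ivm from "isolated-vm";

    // A separate V8 isolate with its own heap cap. Code inside has no
    // require(), no fs, no network - only values you explicitly copy in.
    const isolate = new ivm.Isolate({ memoryLimit: 32 });
    const context = isolate.createContextSync();

    const result = context.evalSync("[1, 2, 3].reduce((a, b) => a + b, 0)");
    console.log(result); // 6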
NPM was a mistake.
We created a minefield of abandonware and called it an ecosystem.
Or perhaps it was all by design?
Microsoft either needs to become a better steward of NPM or hand it off to a foundation that can properly maintain it.
Good plan - I'm sure they'll get right on it after solving the virus and malware issues on their mainline OS.
If they really believe their AI is that good and their security practices and tooling that solid, why can't they automatically flag this stuff? I am sure they can, but once flagged a human has to check, and that seems costly?
It probably could. But I assume that with even a small amount of indirection added, it would no longer be able to pick it up.
There is no AI, it's all a scam.
> The lotusbail npm package presents itself as a WhatsApp Web API library - a fork of the legitimate @whiskeysockets/baileys package.
> The package has been available on npm for 6 months and is still live at the time of writing.
> (...) malware that steals your WhatsApp credentials, intercepts every message, harvests your contacts, installs a persistent backdoor, and encrypts everything before sending it to the threat actor's server.
Is there an increasing trend of supply chain attacks? What can developers do to mitigate the impact?
Mitigate? Stop using random packages. Prevent? Stop using NPM and similar package ecosystems altogether.
That package wasn't any more random than any other NodeJS package. NPM isn't inherently different from, say, Debian repositories, except the latter have oversight and stewardship and scrutiny.
That's what's needed and I am seriously surprised NPM is trusted like it is. And I am seriously surprised developers aren't afraid of being sued for shipping malware to people.
"NPM isn't inherently different from, say, Debian repositories, except the latter have oversight and stewardship and scrutiny"
Yeah, that's the entire point.
> NPM isn't inherently different from, say, Debian repositories, except the latter have oversight and stewardship and scrutiny.
Which when compared to NPM, which has no meaningful controls of any sort, is an enormous difference.
> and similar package ecosystems altogether
Realistically, this is impossible.
It's really, really not. Just write the libraries yourself. Have a team or two who does that stuff.
And, if you do need a lib because it's too much work, like maybe you have to parse some obscure language, just vendor the package. Read it, test it, make sure it works, and then pin the version. Realistically, you should only have a few dozen packages like this.
At some point, having LLMs spit out libraries for you might be safer than actually downloading them.
Or just vendor your deps like we have been doing for decades.
This does help. Even before, I was pretty careful about what I used, not just for security but also simplicity. Nowadays it's even easier to LLM-generate utils that one might've installed a dep for in the past.
LLMs will happily copy-paste malware or add them as dependencies
this kicks the can down the road until we get supply chain attacks through LLM poisoning, like we already do with propaganda
Well, he didn’t say vibe code. Presumably, you’d still be reviewing the AI code before committing it.
I ran a little experiment recently, and it does take longer than just pulling in npm dependencies, but not that much longer for my particular project: logging, routing, rpc layer with end-to-end static types, database migrations, and so on. It took me a week to build a realistic, albeit simple app with only a few dependencies (Preact and Zod) running on Bun.
Does this happen with CPAN?
At least they seemed to have policies:
https://security.metacpan.org/
Yes, and even more so now that we are vibe coding codebases with piles of random deps that nobody even bothers to look at.
You can mitigate it by fully containerizing your dev env, locking your deps, enabling security scans, and manually updating your deps on a lagging schedule.
Never use npm global deps, pretty much the worst thing you can do in this situation.
Review and vendor your dependencies like it’s 1999.
Are many of these packages obfuscated? It seems like here the server URL was heavily obfuscated and encrypted; that is a big warning flag, is it not? Auto-scanning a submitted package and flagging obfuscated code, binary payloads, or install scripts for further inspection could help. I am wondering how such packages get automatically promoted for distribution...
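One cheap heuristic such a scanner could run: flag high-entropy string literals, which encrypted or base64 payloads (like the one quoted at the end of this thread) produce and ordinary code rarely does. A sketch, with an illustrative rather than calibrated threshold:

    // Shannon entropy of a string, in bits per character.
    function shannonEntropy(s: string): number {
      const freq = new Map<string, number>();
      for (const ch of s) freq.set(ch, (freq.get(ch) ?? 0) + 1);
      let h = 0;
      for (const count of freq.values()) {
        const p = count / s.length;
        h -= p * Math.log2(p);
      }
      return h;
    }

    // Encrypted/encoded blobs score well above identifiers and prose.
    const literal = "U2FsdGVkX1+LgFmBqo3Wg0zTlHXoebkTRtjmU0cq9Fs=";
    if (literal.length > 20 && shannonEntropy(literal) > 4.5) {
      console.log("high-entropy literal - worth a closer look");
    }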
If you have to run it regardless, contain it as well as you can, given the potential impact. If you're not using the same machine for anything else, maybe "good riddance" is the way to go? Otherwise try to sandbox it, understanding the tradeoffs and (still) risks. Easiest for now is to run everything in rootless podman containers (or similar), which is relatively easy. Otherwise VMs, or other machines. It all depends on what effort you feel is worth it, so really on what it is you are protecting.
Use dependabot with a cooldown.
I had some dependency of a dependency installing crypto miners: it was pretty scary, as we had not seen this since the wordpress days. I saw a lot more people having this issue ("there is a weird process consuming all my cpu"). Like someone here already said: we need an Apache/NPM commons, and when packages use anything outside those, big fat alarm bells should chime.
As others pointed out elsewhere, that wouldn’t have helped in this case as presumably it wouldn’t include a WhatsApp API, the purpose of this package. But it could help in general, sure.
So is there a list of the most popular apps that made use of the infected lotusbail npm package?
NPM shows 0 dependents in public packages. The 56k downloads number can easily have been gamed by automation and is therefore not a reliable signal of popularity.
As of this writing, the alleged malware/project is still available on npm and GitHub. I'm surprised koi.ai does not mention in their article whether they have reported their findings to npm/GitHub.
Did any other scanner catch this, and when? A detection lag leaderboard would be neat.
Recently audited a software plan created by an AI tool. NPM dependencies as far as the eye can see. I can only imagine the daunting liability concerns if the suggested "engineering" style was actually put forth to be used in production across the wide userbase. That said, the process of the user creating the "draft" codebase gave them a better understanding of scope of work necessary.
I am seriously surprised developers trust NodeJS to this extent and aren't afraid of being sued for inadvertently shipping malware to people.
It's got to be a matter of time, doesn't it, before some software company gets in serious trouble because of that. Or before NPM actually puts some serious stewardship process in place.
This has nothing to do with NodeJS or NPM. The code is freely distributed, just like any open source repo or package manager may provide. The onus is on those who use it to audit what it actually does.
It absolutely does have to do with it. If we continued to ship software libraries like we still do on Linux, then you wouldn't be downloading its releases straight from the source repo, but rather have someone package and maintain them.
Except at the granularity of NodeJS packages, it would be nearly impossible to do.
Was anyone actually affected by this? Is this package a dependency of some popular package?
I assume the answer is no because this is clearly clickbait AI slop but who knows.
Malicious libraries will drive more code to be written by LLMs. Currently, malicious libraries seem to be typically trivial libraries. A WhatsApp API library is just on the edge of something that can be vibe coded, and avoiding getting pwned may be a good enough tipping point to embrace NIH syndrome more and more, which I think would be a net negative for F/OSS
The incentives are aligned with the AI model companies, which benefit from using more tokens to code something from scratch.
Security issues will simply move to LLM related security holes
Popularity is never a metric for security or quality… Always verify.
Verify? Verify what?
Verify what? I certainly don't have the capacity to thoroughly review my every dependency's source code in order to detect potentially hidden malware.
In this case more realistic advice would probably be to either rely on a more popular package to benefit from swarm intelligence, or to create your own implementation.
Also, scrutinize every dependency you introduce. I have seen sooooo many dependencies over the years where a library was brought in for one or two things which you could write yourself in 5 minutes (e.g. commons-lang just for a null-safe string compare or contains).
Sure, but you'd basically need a different ecosystem to be able to bring in a popular package and not end up with these trivial libraries indirectly through some of its dependencies.
Said scrutinizing from my side consists of checking the number of downloads and age of the package, maybe at best a quick look at the GitHub.
Yes, I'm sure many dependencies aren't very necessary. However, in many (corporate) projects I worked on, which were on the older Webpack/Babel/Jest stack, you can expect node_modules to come in at over 1 GB. There, this ship sailed long ago.
But on the upside, most of those packages should be fairly popular. With pnpm's dependency cooldown and whitelisting of postinstall scripts, you are probably good.
But... GitHub stars!
Over a certain popularity it is. 56k downloads is nowhere near the threshold.
> 56k Downloads?
That seems ..low..?
It also seems weird that people are only scanning code that breaks?
I have 0 cred in anything security, so maybe I'm just missing a bigger-picture thing, but like... if you told me I had to make some sort of malicious NPM package and get people to use it, I'd probably just find something that works, copy the code, put in some stylistic changes, and then bury my malicious code in there?
This seems so obvious that I question whether the OP is correct in stating people aren't looking for that, or maybe I misunderstand what they mean because I'm ignorant?
Feels almost SEO. 56k used to be the top speed for modems. It was L33t.
I wonder if this is possible with flutter packages or python? I'm looking to slowly get away from the javascript ecosystem.
I've started using Flutter even for web applications, and it works pretty well. I still use Astro/React for frontend websites though, so I can't completely get away from it.
The code is literally right there for you. It doesn't matter what ecosystem or package manager. Someone could distribute the same thing anywhere — it's up to those pulling it in to actually start auditing what they're accepting.
PyPI has had compromised or fake packages in the past.
yes it is possible with rust, python, php, and likely many others
JavaScript fanatics will downvote me, but I will say it again. JavaScript is meant to be run in an untrusted environment (think browser), and running it in any form of trusted environment increases the risk drastically [1].
The language is too hard to do meaningful static analysis on. This particular attack is much harder (though not impossible) to execute in Java, Go, or Rust-based packages.
1 - https://ashishb.net/tech/javascript/
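A toy example of the problem: neither the module name nor the function being called below appears anywhere in the source as an analyzable string.

    // Toy illustration only - both names are assembled at runtime, so a
    // static scanner sees neither "node:fs" nor "readFileSync" in the text.
    const modName = Buffer.from("bm9kZTpmcw==", "base64").toString(); // "node:fs"
    const fnName = ["read", "File", "Sync"].join("");

    const mod: any = await import(modName);
    console.log(mod[fnName]("/etc/hostname", "utf8"));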
Even in a browser, a compromised JS payload can put your user's data and privacy at risk.
> Even in a browser, a compromised JS payload can put your user's data and privacy at risk.
True. In a backend, however, a compromised payload can put all of your users' data and your non-user data at risk.
> your non-user data at risk.
That sounds like a GDPR fine waiting to be issued right there.
In what way is it harder to write a library that exfiltrates credentials passed to it in those languages? I’d think it’d be a bit easier because you could use the standard library instead of custom encryption, but otherwise pretty much the same.
> In what way is it harder to write a library that exfiltrates credentials passed to it in those languages?
It is not harder to write. It is more challenging to execute this attack stealthily.
Due to the myriad behaviors of runtimes (browser vs. backend), frameworks (and their numerous versions), and over-dependency on external dependencies (e.g., leftpad), the risk in JS-based backends increases significantly.
You almost need to run each npm package isolated to the extent possible, or something equivalent.
Once again, a better supply chain tool that actually reviews changed packages could mitigate this. Holding back some of the dependencies of dependencies might mitigate it too.
Why aren't more teams putting some tool in front of their blind installs from NPM (et al.)?
Wow that AI art looks terrible.
Lots of signs of AI writing also: “not this, but that” constructions everywhere. The first paragraph in Final Thoughts is pure ChatGPT.
It’s hard to read any blog anymore without trying to work out which part is actually from a human.
Soon the only way to assure your readers that your writing is human is by calling them a motherfucker in the opening sentence.
But then, you'd only be sure that the first sentence was legitimate and not the rest of the article. That is why I constantly reassure my readers that they're some goddamn motherfuckers throughout my writing. And you, too, are one, my friend.
We’ve got a bonified human right here motherfuckers
> Traditional security doesn't catch this.
> const backdoorCode = crypto.AES.decrypt(
>     "U2FsdGVkX1+LgFmBqo3Wg0zTlHXoebkTRtjmU0cq9Fs=",
>     "ERROR_FILE"
> ).toString(crypto.enc.Utf8);
Really? Isn't a random garbage string a pretty strong indication of someone doing something suspicious?