GitHub's SAML implementation is useless. The idea is that you can bring your own account into an enterprise, and that sort of works on the site itself, but it does not prevent apps where you log in with GitHub from reading your organization membership once you have authorized an app at the organization level (and if you didn't, it hides the membership from OAuth tokens, so it has this capability!).
A SAML session is only required if said app fetches data via a token obtained from that user - and in my glance around, this was almost never the case - SAST tools almost always use app instance tokens and are happy to show your code to anyone with a GitHub account in your organization. Tailscale fixed this when I pointed it out, SonarCloud asked me to please not tell anyone, and GitHub took a few weeks to say this is totally expected behavior - when no vendor I told thought so, and their own docs contradicted them.
I swear, reporting security bugs is a thankless endeavor, even if you just randomly stumble over them. I couldn't imagine doing this as a job.
> The idea is that you can bring your own account into an enterprise
The issue goes beyond authorization. I’ve had GitHub randomly, once in a blue moon, use my personal email address as the default when merging a work PR. If anyone asks, I advise against mixing personal and professional stuff in the same GitHub account (or anywhere).
This is exactly why I’m so paranoid about account and device separation.
I don’t even trust Git profiles. I buy a new license for GitKraken at every job I go to, even if I could avoid it; to me, the possibility of accidentally committing to the work GitHub with my personal GitHub identity, or vice versa, is not worth it.
It’s the same with Microsoft accounts and their infamously bad-tech-debt-caused spaghetti.
Like, if you try to log in to Outlook on iOS, you get a threatening message to the effect of “your system administrator will be able to remotely control and wipe your entire device if you proceed”. If it’s even a possibility that an incompetent or malicious IT department wipes your personal device, then no thank you.
See also that HN thread where a father let his child use his laptop: they signed into their Microsoft school account, his personal Microsoft account somehow got merged into the school account, and from what I could tell he was never able to fix it and the school IT department didn’t care.
Depends on the org; I think the controls are more fine-grained now. For example, I have Teams and Outlook on my personal phone and the only thing they can do is delete the apps: screenshots come out blank, copy/paste doesn’t work, etc.
Yeah, someone asked why the funny account name, why not use your personal account, and I thought "what, are you crazy?". And that isn't because of SAML etc., just a simple don't-mix-work-and-pleasure ethos! I don't use my personal email to send an email to a customer.
It seems to be very common to use a personal phone for work 2fa or lots of other workplace tasks. Employers seem mystified if you request a corporate device when you obviously already have your own. I even see this a little with personal vehicles.
The idea of separating work and personal seems to be becoming old-fashioned.
Funny, that's exactly why NOWHERE should consider a phone number any form of ID.
Can you elaborate on the connection you see here?
Tying someone's identity to a thing they barely control and find it difficult to get more than one of.
Particularly something someone might reasonably need 3 or more different instances of, e.g. personal/semi-professional, personal NSFW stuff, or work where they didn't give you the X this service demands.
My company does not allow any employees to use their personal GitHub for work (or Facebook, Instagram, or anything else) after running into issues when employees leave.
Wouldn’t you just remove them from the org?
They may decide to change their github login to <company_name>LIES, and suddenly that’s all over your old PRs and Issues. Including in public repos where customers go looking for help.
That's even more true with a dedicated work github account than a mixed personal/work one; either way they can still login and edit the account name even if removed from the company org, and if it's not shared it doesn't burn their personal account too... right?
Is this speaking from experience?
With a dedicated work account the organization can always take over the account (via reset email if need be, since they own your work email account) and do whatever they want with it
A dedicated work account _where you use your work email address_... that was the missing part throughout this thread.
But then if you do that you also lose all your open source work history, which is important from a hiring/resume perspective.
One option for those so inclined is to cryptographically sign commits with a key that lists both work and personal email address (assuming your enterprise’s policy allows it). The employer retains control but you have a claim to credit for your work.
If we're discussing companies willing to go to lengths to scrub you from their GitHub history, they can still replace all commits you've signed with new commits. You likely have no legal rights to that work, and git does allow you to rewrite history arbitrarily.
It depends on the jurisdiction. In the US, copyright assignment is usually permanent. In the EU and Canada, you can claw back your rights to a degree, and even revoke the usage altogether if they did "evil" things with the work (moral rights).
In some cases (even in the US), if the employer does something that would be considered a "breach of contract", you can force them to remove all your code as well.
So, it would not be in the company's best interest to scrub their git history.
I think even in the EU and Canada, you don't have any copyright interest in work you perform as part of your employment. The copyright on the work you produce for your employer is entirely theirs, from the moment it is created.
Now, if you're a contractor performing work for a company, this may be quite different. But as an employee, I don't think you have any claim of authorship to the code you write as part of your job.
> git does allow you to rewrite history arbitrarily.
Technically yes, but the price is too great - everybody who has cloned the repos will now have to nuke their local copies too.
Sure, but the same is true for unsigned commits as well, isn't it? Or can you modify the commit metadata without changing the commit hash in those cases?
And you could still just change it right, as long as you did so before the employer revoked your access via the work email address.
If a spiteful ex-employer wants to scrub ex-employee authorship from the entire commit history in their public repos when someone leaves I don't think there's anything you could do to stop that either way, though it seems like it would be more trouble than it's worth and likely wouldn't scale. If they don't do that, assuming your old company email address still has your name in it I don't see why you'd lose credit for the work you did.
Why not just use the GitHub generated email address you get when you hide your email?
Using more than one Github account violates their ToS though.
To be fair to the vendors, Github makes it extremely difficult to do the right thing here. I built a repo/commit/pr-analysis tool (https://dev.log.xyz) and it took a lot of effort to make it so that "iff you can see it in Github you can see it in Devlog." The entire experience was beyond frustrating.
Github also makes their OAuth permissions picker extremely confusing. When I "login with Github" I am never sure exactly what I'm sharing, from which organizations I'm a member of.
> Github makes it extremely difficult to do the right thing here ... it took a lot of effort to make it so that "iff you can see it in Github you can see it in Devlog." The entire experience was beyond frustrating.
Do they? You don't have to mess with syncing teams, memberships, or assignment to repos if you don't want to. You can make one api call:
> The authenticated user has explicit permission to access repositories they own, repositories where they are a collaborator, and repositories that they can access through an organization membership.
https://docs.github.com/en/rest/repos/repos?apiVersion=2022-...
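For reference, a minimal sketch of that single call in Ruby (assuming a classic personal access token in a hypothetical `GITHUB_TOKEN` env var; not anyone's actual integration code):

```ruby
require "net/http"
require "json"
require "uri"

# GET /user/repos: repositories the authenticated user owns, collaborates on,
# or can access through an organization membership.
uri = URI("https://api.github.com/user/repos?per_page=100")
req = Net::HTTP::Get.new(uri)
req["Authorization"] = "Bearer #{ENV.fetch('GITHUB_TOKEN')}"
req["Accept"] = "application/vnd.github+json"

res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
JSON.parse(res.body).each { |repo| puts "#{repo['full_name']} (private: #{repo['private']})" }
```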
I should've tested this endpoint. GitHub's SAML implementation is done by a different team, always lags behind in quality, and does some pretty unclean patching of the data - e.g. the notification filtering is done in the templating engine, so if all your notifications are SAML-gated you get the header, no "all caught up" below it, and (this is live from my account) "1-0 of 113".
So I'd give it about a 50:50 chance of working.
Edit: I just realized it eats your non-gated notifications too, if they're further down than position 25, and the "Next" button just leads to the same page with "?query=". Yay, another ticket about how glued-on GitHub Enterprise Cloud is. The last one (GitHub eats API calls to accept invites to SAML organizations, deletes the invite, sends a 200, and writes success to the audit log... but it ends up being a no-op) was only 2 months or so ago. Thanks Microsoft.
Yeah, it's a massive UX issue. The way to actually check if someone has a SAML session is to attempt to get their membership. If you get a 403, there isn't one. But good luck explaining to the user that they need to click "authorize" next to the organization in the OAuth flow. No way to send a hint that it may be required, and no way to do a step-up flow.
I did a full writeup here: https://notes.acuteaura.net/posts/github-enterprise-security...
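A rough sketch of that membership probe (hypothetical org name and token; the endpoint is GET /user/memberships/orgs/{org}, called with the user's own OAuth token):

```ruby
require "net/http"
require "uri"

# Per the comment above: a 403 on the user's own org membership is the only
# signal that a SAML session/authorization is missing for this organization.
def saml_blocked?(org, user_token)
  uri = URI("https://api.github.com/user/memberships/orgs/#{org}")
  req = Net::HTTP::Get.new(uri)
  req["Authorization"] = "Bearer #{user_token}"
  req["Accept"] = "application/vnd.github+json"
  res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
  res.code == "403"
end

puts saml_blocked?("example-org", ENV.fetch("USER_OAUTH_TOKEN")) # hypothetical values
```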
For anyone trying to connect the above to this vuln research, this seems unrelated ("GitHub doesn’t currently use ruby-saml for authentication, but began evaluating the use of the library with the intention of using an open source library for SAML authentication once more")
This is the operating procedure at every conceivable level. You would not believe how difficult it is to convince young developers raised on Javascript that client side validation is not enough, much less the business owners setting out functional requirements and budgets.
”You would not believe how difficult it is to convince young developers raised on Javascript that client side validation is not enough”
At first read, I think you’re JSplaining, but I’m willing to give you the benefit of the doubt.
How difficult is it exactly? Can you provide examples, perhaps even of the particular difficulties? Are the difficulties on the side of the convincer or the convincee, or both?
I think it is something they have to experience. Tell them: if they're happy with it, give me a $10 bug bounty. Then go hack a deploy of their branch. Then tell 'em to keep the $10 but remember the lesson.
Wow. I would never guess it was so hard to convince someone of this.
“The code I write doesn’t have XSS or SQL injection vulnerabilities,” sure. At least those are plausible things to believe.
Client side validation?? How could anybody believe in that?
I convinced fellow engineers who were adamant that the code they had written was OK by writing actual exploits against their code. Twice. Worked both times, without betting on money.
The premise of comfort from shared credentials, and perhaps of increased security from SSO, breaks down the moment you have a vulnerability like this.
Any type of password store, even a physical one, or just reusing passwords, ends up being safer.
Minimalism wins again
I recently had to implement SAML and this headline does not surprise me in the slightest.
The SAML spec itself is fairly reasonable, but is built upon XML signatures (and in turn, XML canonicalization) which are truly insane standards, if they can even be called such.
Only a committee could produce such a twisted and depraved specification, no single mind would be capable of holding and combining such contradictory ideas.
It would be so simple to just transmit signatures out-of-band and SAML would be a pleasure to implement.
It’s much worse than you’re making it sound: XML is literally an eXtensible Markup Language, so… of course the SAML standardisation committee invented their own extension mechanism language on top of it.
Coming up with your own protocol on top of a protocol for a tiny amount of data amounting to not much more than what’s in an authentication cookie is the special kind of stupid that only the largest and most bureaucratic committees can produce.
Is SSO salvageable at all? It seems like the idea of just logging into different accounts is fine.
Also, just the idea of connecting your accounts together such that you can get mega-compromised is foundationally riskier than keeping them separate.
The biggest problem with having separate accounts for everything is that a lot of the users will make their own "wish it were SSO" by setting their passwords in the systems to the same value. Then, when the weakest system is exploited, the attacker gets credentials that are valid across the organization. Yes, they should be using a password manager with unique, random passwords for each system, but realistically a good chunk of larger organizations' staff are not going to do that.
Some other headaches:
- Having decentralised authentication means that onboarding and offboarding need a bunch of tedious manual steps, or custom automation.
- Whoever does user support for the organization has to be trained to reset passwords/unlock accounts in a hodgepodge of systems.
- Any security controls the organization wants to implement need to be reimplemented or approximated in a bunch of different systems, e.g. if there are regulatory requirements for account lockouts, time between explicit reauthentication, etc.
- It becomes much more critical to collect the authentication logs/event data for all of those systems, and harmonize its formatting with everything else so that the security ops team isn't maintaining separate monitoring/alerting rules for every system.
For large-scale systems, there are also at least theoretically performance advantages to the kind of signed ticket approach that SSO mechanisms tend to use, versus having to do database lookups of session IDs or verify a password. It's possible to do that without SSO, but if you're going to the trouble of implementing that kind of mechanism, you're most of the way to having SSO anyway, and might as well just finish the job IMO.
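To illustrate the signed-ticket idea (not any particular SSO product; a minimal HMAC sketch with made-up field names), verification only needs the shared key, not a session-store lookup:

```ruby
require "openssl"
require "base64"
require "json"

SECRET = ENV.fetch("TICKET_SIGNING_KEY") # known only to services that mint/verify tickets

# Issue: the claims travel with the ticket, signed as raw bytes.
def issue_ticket(user_id, ttl: 3600)
  payload = JSON.generate(sub: user_id, exp: Time.now.to_i + ttl)
  "#{Base64.strict_encode64(payload)}.#{OpenSSL::HMAC.hexdigest('SHA256', SECRET, payload)}"
end

# Verify: recompute the MAC over the exact bytes that were signed; no database hit.
def verify_ticket(ticket)
  encoded, mac = ticket.split(".", 2)
  payload = Base64.strict_decode64(encoded)
  return nil unless OpenSSL.secure_compare(OpenSSL::HMAC.hexdigest("SHA256", SECRET, payload), mac)
  claims = JSON.parse(payload)
  claims["exp"] > Time.now.to_i ? claims : nil
end
```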
"If there are regulatory requirements for account lockouts"
Then the vendor can implement those? The need for control might itself be a source of greater risk.
Often password rules force slight variations of the passwords, so the damage is limited.
Furthermore, attackers don't try all accounts since they don't know which ones exist (unless they have email access).
SAML is not the only standard for SSO. Before SAML we had Kerberos, and nowadays you can use OpenID Connect. Other standards have their own gotchas, but SAML is uniquely horrendous.
When we get vulnerabilities in the SSO protocol (SAML or otherwise) these vulnerabilities generally only affect some of the clients (identity consumers) who have implemented the protocol incorrectly or are using a feature that the provider has implemented incorrectly. Vulnerabilities that break the entire provider are less common.
When comparing this situation to having multiple different accounts, I can't see how SSO is less secure. Sure, when you have a breach that affects the entire identity provider the damage is high, but the risk of having a breach (any breach!) is lower, since implementations are fewer, more consolidated, and usually developed by people with better expertise.
OIDC is better than SAML, but that isn't a high bar. And OIDC has its own problems.
OIDC's problems are nothing like those of SAML.
One can use OIDC instead of SAML for SSO.
SAML (more broadly XML-DSIG) is literally the worst security protocol in common use. I think you should generally be taking whatever hits you need to take to transition from it to OAuth. Certainly, I would refuse to bring a new product to market that relied on it. It's incredibly dangerous. Unless there's some breakthrough in practical formal verification, I can't imagine that this will be the last or the worst DSIG vulnerability.
One day I will write an essay on all of the incredibly stupid things XML DSig does, and that's not even touching the cryptography. It's peak enterprise software brain.
Someone should go deep on the mailing list and standards body horrors of WS-* and OASIS/XACML and all that crap
My (possibly misunderstood to the point of misphrasing) understanding is that SAML still has the point of difference that your SSO provider can cancel a session. Is that right?
Can it? In many SAML setups, there's no direct network interaction between the IdP and the Service, other than at most sharing metadata via URL.
There's an optional feature in the spec, I think (Single Logout). But in my very limited experience, it is never implemented or working correctly.
Microsoft implements this in Azure/M365.
OIDC has out-of-band backchannel logout.
Security Cryptography Whatever’s take on this week’s SAML nonsense will be fun.
Honestly, hadn't thought of it, but of course we should do that. Thanks!
I’d sub to infosec rant podcasts. $5/mo Patreon sub for a non-vtuber version.
Ugh. No one should use REXML unless they have no other choice. It will happily parse invalid xml, which causes an infinite number of problems downstream.
It’s quite literally parsing xml using regular expressions. It’s an excellent case study for why you shouldn’t do that.
Projects didn’t start using Nokogiri for performance. They used it because it’s correct.
One of the risks of AI code assistants is that they are not necessarily looking at the wider picture when it comes to the libraries used in a large code base.
I was testing o3 recently and it kept changing the library used by a block of code every time it tried to fix an issue in the block that was unrelated to the library used (haven't seen that happen with Sonnet)
It's easy to see how issues could creep in: a modification switches to an inferior library/gem that already exists in the code base or standard library, so it still passes tests etc. and doesn't need a Gemfile change.
> It’s quite literally parsing xml using regular expressions. It’s an excellent case study for why you shouldn’t do that.
It's like a textbook example no? Don't parse non-regular languages with regular expressions.
This is a great write-up.
He's mentioned in the article, but a major shout-out is warranted for ahacker1. He's doing really sophisticated and valuable work to secure SAML implementations. We at SSOReady are really appreciative of his work.
Earlier this week, WorkOS put together a nice write-up on their own collaboration with ahacker1: https://workos.com/blog/samlstorm
>We discovered an exploitable instance of this vulnerability in GitLab, and have notified their security team
GitLab has released a fix on their end for anyone else wondering
https://about.gitlab.com/releases/2025/03/12/patch-release-g...
Related: Latacora's (2019) article, How (not) to sign a JSON object[1].
In short, nesting trees and signing them is difficult and prone to pitfalls. It's easier if the envelope holds the message as a raw string, and the signing is performed on the raw string.
[1]: https://www.latacora.com/blog/2019/07/24/how-not-to/
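A minimal sketch of that approach (made-up envelope fields, with HMAC standing in for whatever signature scheme you actually use): the payload stays an opaque string, the MAC is checked over those exact bytes, and only then is the payload parsed.

```ruby
require "openssl"
require "base64"
require "json"

KEY = ENV.fetch("ENVELOPE_KEY")

# Sender: serialize once, sign the resulting bytes. No canonicalization step exists.
def seal(claims)
  raw = JSON.generate(claims)
  { "payload" => Base64.strict_encode64(raw),
    "sig"     => OpenSSL::HMAC.hexdigest("SHA256", KEY, raw) }
end

# Receiver: verify against the decoded bytes *before* parsing them.
def open_envelope(envelope)
  raw = Base64.strict_decode64(envelope.fetch("payload"))
  raise "bad signature" unless OpenSSL.secure_compare(
    OpenSSL::HMAC.hexdigest("SHA256", KEY, raw), envelope.fetch("sig"))
  JSON.parse(raw)
end
```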
Isn't the simpler conclusion here that one should look for the signature where it is supposed to be? Instead of using an excessively general XPath like "//ds:Signature" that might find any signature in any unexpected location...
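For instance, with Nokogiri and the standard SAML/dsig namespaces, an anchored XPath accepts a signature only in the one place the profile expects it, while `//ds:Signature` will happily match one smuggled in anywhere (a sketch, not a complete validator):

```ruby
require "nokogiri"

NS = {
  "samlp" => "urn:oasis:names:tc:SAML:2.0:protocol",
  "saml"  => "urn:oasis:names:tc:SAML:2.0:assertion",
  "ds"    => "http://www.w3.org/2000/09/xmldsig#"
}

doc = Nokogiri::XML(File.read("response.xml")) { |c| c.strict.nonet }

# Too permissive: matches a ds:Signature element at any depth in the document.
anywhere = doc.xpath("//ds:Signature", NS)

# Anchored: only a signature that is a direct child of the assertion counts.
expected = doc.at_xpath("/samlp:Response/saml:Assertion/ds:Signature", NS)
```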
Hot take, but for me the conclusion always was -- get a big stick and use it to prevent web developers from touching anything near your security sensitive code. Starting from design, protocols and data formats of it. The set of habits and design considerations simply doesn't match day to day practice of the usual web development. It's often the opposite of what you need to write normal code.
I don't think it's fair to blame the skill of web developers (although if they use javascript and leftpaddings they have it coming).
The nature of web software makes it 100 times riskier than anything else, because of the risk profile and 100% connectivity.
Anyone who thinks a publicly accessible web site is secure is insane.
I feel most responses to vulnerabilities are too lenient. Sometimes you have to throw out some baby with the bathwater: you can't surgically remove the dangerous component, you have to chop it out and apply chemotherapy en masse.
If you are an IT admin with any pride, SAML is out of any future plans. The idea of SSO is suspect as a whole. Xml parsing has been hit twice in a week, avoid it in the future, anything wrong with a policy that replaces xml with json?
> Xml parsing has been hit twice in a week, avoid it in the future, anything wrong with a policy that replaces xml with json?
OAuth 2.0 and its extension OpenID Connect have been around for over a decade. They have their own gotchas (like the badly defined ID token in OIDC and the ill-thought-out implicit and hybrid flows), but nothing there is nearly as dangerous as SAML.
Most applications support OpenID Connect now, but I'm still seeing organizations choose SAML out of inertia even when they are fully capable of using OpenID Connect.
For an organization of any significant size (say, anything over 10 people), not deploying SSO would be malpractice. The point of SSO is to have a single point of control and a single, mandatory 2FA stack.
Obviously, if you can avoid doing SSO with SAML, you should.
Parse this JSON correctly:

```json
{ "data": "XXX", "sig": "BAD", "sig": "GOOD" }
```
In a security-sensitive context, a parser should return an error on a duplicate key, regardless of what common parsers do and what the RFC fails to specify.
Implicitly, that means no security software dealing with JSON should be written in Go, JavaScript, Ruby, Python, etc. (where practically everyone uses JSON parsers that silently accept duplicate keys).
Plenty of languages do have common JSON libraries with duplicate-key errors, like Haskell (aeson), Rust (serde_json), Java (gson, org.json, probably others), so there are plenty of good choices.
So yeah, the correct parse result is '400 bad request'.
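To make the failure mode concrete: as far as I know, Ruby's stock parser (like most of the default parsers in the languages named above) silently keeps the last occurrence, so two components can "successfully" parse the same bytes and check different signatures:

```ruby
require "json"

doc = '{ "data": "XXX", "sig": "BAD", "sig": "GOOD" }'

# The default parser quietly keeps the last duplicate...
JSON.parse(doc)  # => {"data"=>"XXX", "sig"=>"GOOD"}

# ...so if another component in the pipeline keeps the *first* occurrence
# instead, the two sides disagree about which "sig" was verified. The safe
# behaviour argued for above is to reject the document outright (HTTP 400).
```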
For Java, I think you mean Jackson, not gson, unless something has changed recently. Goes to show that even the behemoths can get this wrong.
https://github.com/protocolbuffers/protobuf/blob/6aefdde9736...
I overwrite with the last one.
Strictly not a parser problem.
Csv is also available.
And binary protocols, with index-based implicit keys and byte lengths prepended to variable-length fields. Those are the gold standard (see IP and TCP headers).
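A toy illustration of that style (made-up framing, not any real protocol): a fixed field id plus an explicit length prefix means the receiver never re-interprets the bytes, so there is no second "parse" in which a duplicate key could shadow the first:

```ruby
# Encode: 1-byte field id, 4-byte big-endian length, then the raw bytes.
def encode_field(id, bytes)
  [id, bytes.bytesize].pack("CN") + bytes
end

# Decode: the length says exactly where the field ends; nothing is re-scanned.
def decode_field(buf)
  id, len = buf.unpack("CN")
  [id, buf.byteslice(5, len)]
end

p decode_field(encode_field(1, "hello"))  # => [1, "hello"]
```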
It's kind of annoying to explain the vulnerability in a blog post and then omit the parser differential in question.
It is like writing the introduction to a story and omitting the climax.
The sibling comment's blog post <https://news.ycombinator.com/item?id=43374972> included the relevant detail: they were just doing (...//ds:DigestValue).firstChild.nodeValue without checking that .firstChild was a Node (in the offending case, it was a Comment). Thus the non-canonicalizing parser saw the "masked" signature, the corrected one (which tossed out comments) saw a Node, and when two implementations differ about a signed document, hilarity ensues.
Are you sure that is the one for this blog post? I got the impression that was a different vuln in a different SAML implementation.
Also, using comments to bypass SAML is very old news. https://duo.com/blog/duo-finds-saml-vulnerabilities-affectin... is a post from 2018 about it.
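For anyone who hasn't seen that 2018 class of bug, the core is a comment-node differential that's easy to reproduce with the two Ruby parsers discussed in this post (illustrative of the Duo-era truncation trick, not the exact bug described here):

```ruby
require "rexml/document"
require "nokogiri"

xml = "<NameID>user@example.com<!--x-->.attacker.example</NameID>"

# REXML: Element#text returns only the first text node, i.e. the part before the comment.
REXML::Document.new(xml).root.text  # => "user@example.com"

# Nokogiri: #text concatenates all text descendants (comments excluded).
Nokogiri::XML(xml).root.text        # => "user@example.com.attacker.example"

# Canonicalization ignores the comment, so the IdP's signature over the full
# NameID still verifies, but a library that extracts only the first text node
# sees "user@example.com", letting whoever controls the attacker.example
# mailbox log in as that user.
```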
Evidently it's not the same, sorry; it seems I leapt to conclusions with the two signature-mismatch vulns by ahacker1 showing up so close to one another. Opening the very tiny, very dark code picture shows this one seems to be XPath-centric, not nodeType-centric as the WorkOS link discussed.
I'm guessing they didn't want to be directly responsible for dropping a zero-day that allows authorization bypass in countless systems across the planet before the parties responsible for those systems have a chance to fix them.
I'm sure the specifics will come out sooner or later.
Don't use SAML, mostly because it uses XMLDSig. Don't use XMLDSig because it's hard to get usefully right and easy to get dangerously wrong.
I’m aware of the reputation of XML signatures, but it’s the first time I read about technical details, and they make my head spin.
Q: Is there any non-legacy reason to use SAML instead of libsodium’s public key authenticated encryption (crypto_box)?
Another Q: Is there any non-theoretical risk of a parser differential when using libsodium’s crypto_box on one end and Golang’s x/crypto/nacl/box on the other end?
Wouldn't using crypto_box mean the developer would have to implement their own custom authorization mechanism from scratch?
i.e. it looks like a reasonably good way of exchanging encrypted messages, but I don't see anything in the docs indicating that it would provide the equivalent of group membership/roles/permissions.
Building something like that as custom code is a huge commitment, and could easily result in severe vulnerabilities specific to that system.
I was thinking you could strip out the broken signature protocol in SAML (replace it with libsodium) and leave the actual payload intact, or even switch from XML to a simpler wire format like JSON for the payload. But maybe even the payload part of the standard isn't worth saving; I can't tell after reading a single article about it.
BlueSky post with a video showing the vulnerability: https://bsky.app/profile/ulldma.bsky.social/post/3lkbi6rasl2...
SAML is insecure by design. Others have said it better before me, such as https://joonas.fi/2021/08/saml-is-insecure-by-design/, but the quote I got from an old thread here was "Sign Bytes, not meanings".
Parser differentials are expected and even necessary. What you intend to get from a signed response is very meaningful. A dilemma in modern TLS is that sometimes you want to trust one internal CA; That's the easy path. Sometimes you want to accept a certificate from a partner's CA, and you've got multiple partners - and you can no longer examine just the end certificate, but the root of that chain is equally important in your decisions.
This is also why I recommend against AWS Sig algorithms whenever possible; V4 is theoretically secure, but they screwed it up twice - SigV1 and SigV3 were insecure by design, and yet somehow made it past design review and into the public.
Interesting vulnerability! It's a classic example of how seemingly small differences in implementation (REXML vs Nokogiri) can lead to significant security holes. Kudos to Peter Stöckli and ahacker1 for finding it!
I wonder how many other libraries are vulnerable to similar parser differential attacks. It's a good reminder to be extremely careful when dealing with XML and SAML, which are complex beasts at the best of times. As asmor pointed out, Github's SAML implementation has other issues too. It seems like SAML is just inherently difficult to get right.
Also, to the person who suggested not mixing personal and professional stuff in the same Github account: wise words! I've seen that cause headaches more than once.
This is an example of a parser mismatch vulnerability.
Related submission a year ago: https://news.ycombinator.com/item?id=38743029
Maybe a stupid question, but would older versions of Puppet be affected (like 6)? Also, is there a site to check dependencies to see what may be affected?
Try Dependabot? But it's a tool by GitHub; maybe better not to depend on a self-report.
XML is to authentication bypasses what C is to buffer overflow attacks
You're selling XML short here, it had its own share of straight up RCEs too.
XML could really benefit from a standardized subset that cuts out all the unnecessary features and security footguns.
GMarkup[1] is pretty close to what I had in mind. If only it was more prevalent and had an agreed upon standard.
[1]: https://docs.gtk.org/glib/markup.html
I find that the "unnecessary features" and footguns are what makes XML, well, XML. I guess there must be some legitimate usage of those, or at least was back in the day. If you strip them out, you'd end up with a JSON-like (so you may as well use JSON).
No, you would have an extensible markup language. And json is not a good fit for markup.
Now, xml has also been used for a lot of things where a hierarchical format like json would have worked better than a markup format, of which SAML would be a good example. But there are also cases where a markup format makes more sense, like svg or docbook, or odf.
Or something like RON
https://github.com/ron-rs/ron
Sad that XML has too many features for an otherwise somewhat nice, but verbose markup language.
Some of it isn't explicitly XML's fault (although it doesn't help). SAML and especially XML Signature are terrible standards even in ways that don't involve XML.
Features are kind of a negative for security. Imagine if YAML was used!
I think there is a "safe" subset of both XML and YAML that 80% of people actually use.
Which is exactly the problem: if you have two parsers of the same format in a security context that show slightly different behavior (maybe in the remaining 20%, maybe not), that's often enough.
From a security perspective that's kind of useless, as your concern is not what the "good" people do, it's what the "bad" people do.
Well, you can define such a subset and write or configure parsers to only use that; I've seen both XML and YAML libraries do just that, by disabling remote file loading or arbitrary code execution for example.
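A sketch of what that can look like with the Ruby libraries discussed in this thread (defensive defaults, not a complete hardening guide):

```ruby
require "nokogiri"
require "yaml"
require "date"

# XML: strict parsing and no network access; NOENT is deliberately left off,
# so external entities are not fetched or substituted.
def parse_xml_strictly(str)
  Nokogiri::XML(str) { |config| config.strict.nonet }
end

# YAML: safe_load only builds plain data types (plus what you allow-list),
# so no arbitrary Ruby objects are instantiated on load.
def parse_yaml_safely(str)
  YAML.safe_load(str, permitted_classes: [Date, Time], aliases: false)
end
```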
Disabling xml remote entities and billion laughs is a given.
In the context of SAML that's hardly the least of it. Lots of the problems are things like allowing comments to sort of change the meaning of the document, allowing signatures to sign only part of the document, allowing multiple signatures to sign different parts of the document, etc.