This was originally announced early last year. It removes the requirement for TLD and nTLD (but not ccTLD) operators to keep a WHOIS service available, but doesn't mandate that they shut it down.
So far the sunsetting has had little effect, with most TLDs still keeping their WHOIS services online. In reality, I think we'll see a period of time where many TLDs and nTLDs have both WHOIS and RDAP available.
Additionally, since ccTLDs aren't governed by ICANN, many don't even have an RDAP service available. As such, there's going to be a mix of RDAP and WHOIS in use across the entire internet for some time to come.
Disclosure: I run https://viewdns.info/ and have spent many an hour dealing with both WHOIS and RDAP parsing to make sure that our service returns consistent data (via our web interface and API) regardless of the protocol in use.
I think RDAP is going to be adopted by more and more ccTLDs as well. WHOIS is not a particularly well liked protocol (I was at an IETF meeting where ICANN did a presentation on the timeline and people were literally cheering for the demise of WHOIS).
It's funny to see that a lot of services are finally moving from a human-readable / plain text format towards structured protocols right at the point where we can finally have LLMs parse the unstructured protocols :-)
Well, you can't really trust an LLM to give you reproducible output every time, and you can't even trust it to be faithful to the input data, so it's nice to have a standard format now. It also takes something like a millionth of the computing resources to parse. And WHOIS was barely human-readable anyway, with fields all over the place, missing, or different from one registry to the next. A welcome change that really should have come sooner.
We can't ever have LLMs reliably parse any form of data. You know what can parse it perfectly, though? A parser. Which works perfectly, and consistently.
On non-conformant inputs, a parser will barf and yell at you, which is exactly what you want.
On non-conformant inputs, there's absolutely no telling what an LLM will do, which is precisely the problem. It might barf, or it might blissfully continue, and even if the input was right you couldn't remotely trust it to regurgitate the input verbatim.
As for bugs, it is at least theoretically possible to write a parser with no bugs, whereas an LLM is fundamentally probabilistic.
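To make the "barf and yell" property concrete, here's a minimal sketch of a strict parser for a hypothetical "Key: Value" record format; any line that doesn't conform raises an error instead of being silently guessed at:

```python
def parse_record(text: str) -> dict:
    """Parse 'Key: Value' lines; raise ValueError on anything else."""
    fields = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are allowed
        if ":" not in line:
            # Non-conformant input: fail loudly, with a precise location.
            raise ValueError(f"line {lineno}: not a 'Key: Value' pair: {line!r}")
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields
```

The point is that the failure mode is deterministic and localized: you either get a complete, faithful dict, or an exception pointing at the offending line.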
Of course we can. Reliability is a spectrum, not a binary state. You can push it up however high you like, and stop somewhere between "we don't care about error rate this low" and "error rate is so low it's unlikely to show in practice".
It's not like this is a new concept. There are plenty of algorithms we've been using for decades that are only statistically correct. A perfect example of this is efficient primality testing, which is probabilistic in nature[0], but you can easily make the probability of error as small as "unlikely to happen before heat death of the universe".
There are two problems with this comparison. First, probabilistic prime generation has a mathematically proven lower bound that improves with iteration. There is no comparably robust tuning parameter with an LLM. You can use a different model, you can use a bigger variant of the same model, etc., but these all have empirically determined and contextually sensitive reliability levels that are not otherwise tunable. Second, the prime generation function will always give you an integer, and never an apple, or a bicycle, or a phantasm. LLMs regurgitate and hallucinate, which means that a simple error rate is not the only metric that matters. One must also consider how egregiously wrong and even nonsensical the errors can be.
I think the better statement is that, if, say, you're running the Miller-Rabin test 10 times, you can be confident that an error in one test is uncorrelated with an error in the next test, so it's easy to dial up the accuracy as close to 1 as desired. Whereas with an LLM, correlated errors seem much more likely; if it failed three times parsing the same piece of data, I would have no confidence that the 4th-10th times would have the same accuracy rate as on a fresh piece of data. LLMs seem much more like the Fermat primality test, except that their "Carmichael numbers" are a lot more common.
The general point is not that the feature currently exists to dial down the LLM parse error rate, it’s that the abstract argument “we can’t use LLMs because they aren't perfect” isn’t a realistic argument in the first place. You’re probably reading this on hardware that _probably_ shows you the correct text almost all of the time but isn’t guaranteed to.
Precisely this. People dismiss utility of LLMs because they don't give 100% reliability, without considering the basic facts that:
- LLMs != ChatGPT interface, they don't need to be run in isolation, nor do they need to do everything end-to-end.
- There are no 100% reliable systems - neither technological nor social. Voltages fluctuate, radiation flips bits, humans confabulate just as much as, if not worse than, LLMs, etc.
- We create reliability from unreliable systems.
LLMs aren't some magic unreliability pixie dust that makes everything they touch beyond repair. They're just another system with bounded reliability, and can be worked into larger systems just like anything else, and total reliability can be improved through this.
If your job is to be a reference, to have authority, you absolutely don't want to make any errors. Pretty safe isn't enough; you need to be absolutely sure that you control the output.
I wouldn't use LLMs, but if I did, I would try to get the LLM to write parser code instead.
If it can convert from one format to another, then it can generate test cases for the parser. Then hopefully it can use those to iterate on parser code until it passes the tests.
In a sense, asking it to automate the work isn't as straightforward as asking it to do the work. But if the approach does pan out, it might be easier overall since it's probably easier to deploy generated code to production (than deploying LLMs).
My desktop GPU can run small models at 185 tokens a second. Larger models with speculative decoding: 50t/s. With a small, finetuned model as the draft model, no, this won't take much power at all to run inference.
Training, sure, but that's buy once cry once.
Whether this means it's a good idea, I don't think so, but the energy usage for parsing isn't why.
A simple text parser would probably be 10,000,000 times as fast. So the statement that this won't take much power at all is a bit of an overstatement.
50 tokens per second. Compared to a quick and dirty parser written in python or even a regex? That's going to be many many orders of magnitude slower+costlier.
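For a sense of scale, the quick-and-dirty regex parser being compared here really is just a few lines. A sketch with a made-up record (field names are illustrative); at 50 tokens/second an LLM would spend seconds per record, while this handles thousands per second:

```python
import re

# A hypothetical WHOIS-style record for illustration.
RECORD = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Creation Date: 1995-08-14T04:00:00Z
"""

# One 'Key: Value' pair per line; the key may not contain a colon.
FIELD_RE = re.compile(r"^\s*([^:]+):\s*(.+?)\s*$", re.MULTILINE)

def extract_fields(record: str) -> dict:
    return {k.strip(): v for k, v in FIELD_RE.findall(record)}
```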
You'll need to provide actual figures and benchmark these against an actual parser.
I've written parsers for larger-scale server stuff. And while I too don't have these benchmarks available, I'll dare to wager quite a lot that a dedicated parser for almost anything will outperform an LLM by orders of magnitude. I won't be surprised if a parser written in rust uses upwards of 10k times less energy than the most efficient LLM setup today. Hell, even a sed/awk/bash monstrosity probably outperforms such an LLM hundreds of times, energy wise.
How many times would you need to parse before using an LLM to write a parser, then running that parser, saves energy over using the LLM to parse directly?
It sounds like you need to learn how to program without using an LLM, but even if you used one to write a parser, and it took you 100 requests to do so, you would very quickly get the desired energy savings.
This is the kind of thinking that leads to modern software being slower than software from 30 years ago, even though it is running on hardware that's hundreds of times faster.
People not using The AWK Programming Language as a reference to parse stuff,
and maybe The C Programming Language with AWKA (an AWK-to-C translator) and a simple CSP library for threading, yields a disaster in computing.
LLMs are not the solution; they are a source of big trouble.
I was thinking more of when a sufficiently advanced device would be able to “decide” the task would be worth using its own capabilities to write some code to tackle the problem rather than brute force.
For small problems it’s not worthwhile, for large problems it is.
It’s similar to choosing to manually do something vs automate it.
I didn't use an LLM back then. But would totally do that today (copilot).
Especially since the parser(s) I wrote were rather straightforward finite state machines with stream handling in front, parallel/async tooling around it, and at the core business logic (domain).
Streaming, job/thread/mutex management, FSM are all solved and clear. And I'm convinced an LLM like copilot is very good at writing code for things that have been solved.
The LLM, however, would get very much in the way in the domain/business layer. Because it hasn't got the statistical body of examples to handle our case.
(Parsers I wrote were, among others: IBAN, gps-trails, user-defined-calculations (simple math formulas), and a DSL to describe hierarchies. I wrote them in Ruby, PHP, rust and perl.)
It’s not just about the energy usage, but also purchase cost of the GPUs and opportunity cost of not using those GPUs for something more valuable (after you have bought them). Especially if you’re doing this at large scale and not just on a single desktop machine.
Of course you were already saying it’s not a good idea, but I think the above definitely plays a role at scale as well.
My assumption is that models are getting cheaper, fast. So you can build now with OpenAI/Anthropic/etc and swap it out for a local or hosted model in a year.
This doesn't work for all use cases but data extraction is pretty safe. Treat it like a database query -- a slow but high availability and relatively cheap call.
While it will become cheaper, it will never be as fast / efficient as 'just' parsing the data the old-fashioned way.
It feels like using AI to do computing things instead of writing code is just like when we moved to relatively inefficient web technology for front-ends, where we needed beefier systems to get the same performance as we used to have, or when cloud computing became a thing and efficiency / speed became a factor of credit card limit instead of code efficiency.
Call me a luddite but I think as software developers we should do better, reduce waste, embrace mechanical sympathy, etc. Using AI to generate some code is fine - it's just the next step in code generators that I've been using throughout all my career IMO. But using AI to do tasks that can also be done 1000x more efficiently, like parsing / processing data, is going in the wrong direction.
I know this particular problem space well. AI is a reasonable solution. WHOIS records are intentionally made to be human readable and not be machine parseable without huge effort because so many people were scraping them. So the same registrar may return records in a huge range of text formats. You can write code to handle them all if you really want to, but if you are not doing it en masse, AI is going to probably be a cheaper solution.
Example: https://github.com/weppos/whois is a very solid library for whois parsing but cannot handle all servers, as they say themselves. That has fifteen + years of work on it.
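To illustrate the pain point: the same logical field shows up under different labels depending on the server, so hand-written parsers end up maintaining per-registry synonym tables. A toy sketch of that normalisation step (the synonym table is made up for illustration, not taken from the weppos/whois library):

```python
# Maps the various labels registries use onto canonical field names.
# Real-world tables run to hundreds of entries across servers.
SYNONYMS = {
    "domain name": "domain",
    "domain": "domain",
    "registrar": "registrar",
    "sponsoring registrar": "registrar",
    "creation date": "created",
    "created": "created",
    "registered on": "created",
}

def normalise(raw_fields: dict) -> dict:
    """Fold registry-specific labels into canonical keys; drop unknowns."""
    out = {}
    for key, value in raw_fields.items():
        canon = SYNONYMS.get(key.strip().lower())
        if canon and canon not in out:
            out[canon] = value
    return out
```

The hard part isn't the code, it's discovering and maintaining the mapping for every server, which is exactly the fifteen-plus years of work mentioned above.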
I think you’re both right, and also both are missing the point.
Using LLMs to parse whois data is okay in the meantime (preferably as a last resort!), but structuring the data properly in the first place (i.e. RDAP) is the better solution in the long run.
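For contrast, RDAP responses are plain JSON (RFC 9083), so extracting a field is a dictionary lookup rather than text scraping. A sketch against a trimmed, illustrative response (the field names here are real RDAP ones; the values are made up):

```python
import json

# A cut-down RDAP domain object, in the RFC 9083 shape.
RDAP = json.loads("""{
  "objectClassName": "domain",
  "ldhName": "EXAMPLE.COM",
  "events": [
    {"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"},
    {"eventAction": "expiration", "eventDate": "2026-08-13T04:00:00Z"}
  ]
}""")

def event_date(domain: dict, action: str):
    """Return the date of a named event (registration, expiration, ...)."""
    for ev in domain.get("events", []):
        if ev.get("eventAction") == action:
            return ev.get("eventDate")
    return None
```

No synonym tables, no per-server quirks: the structure is the standard.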
Requesting that people think before transferring mission critical code into the hands of LLMs is not being a Luddite lol.
Can you imagine how many ridiculous errors we would have if LLMs structured data into protobufs. Or if they compiled software.
It's more than 1000x more wasteful resources wise too. The llm swiss army knife is the Balenciaga all leather garbage bag option for a vast majority of use cases
Still, I wouldn't use an LLM for what's essentially a database query: by their very nature, LLMs will give you the right answer most of the times, but will sometimes return you wrong information. Better stay on a deterministic DB query in this case.
As usual, arguments for LLMs are based on rosy assumptions about future trajectory. How about we talk about data extraction at that point in the future when models are already cheap enough. And in the meantime just assume the future is uncertain, as it obviously is.
Off topic, but thank you for running viewdns.info. I don't use it regularly, mainly for the occasional WHOIS information lookup, and it has always worked perfectly.
Hey, I've been looking for a tool that can do reverse NS lookup for nameserver pairs (i.e. which domains have nameservers ns1.example.com and ns2.example.com) but all the services out there that I've found can only do one. Is this something you would consider implementing?
It's kind of funny that some operators have never had it in practice. For example, .es never had a public whois, and you needed to register with a national ID (and, I think, from a fixed IP address) to get access to it.
That need for a national ID hasn't been in place for a long time, AFAIK.
I've had a .es (my nickname berkes, domain berk.es) for almost 16 years now, and I live in the EU, but not in Spain. In the beginning I used a small company that offered services for non-Spanish companies to register .es through them (I believe they technically owned the domains?). But today it's just in my local domain registrar without need for an ID.
That .es has no whois has struck me as somewhat of a benefit, actually. Back in the day, it kept away a lot of spam from spammers that'd just lift email addresses off the whois. My .com, .nl and other domains receive(d) significantly more such spam. Let alone phone numbers and other personal details delivered over an efficient, decentralized network. Though recent privacy addons(?) have mitigated that a little.
Usually, the need to use an ID is only for private persons (and usually only if they are nationals). Anyone else should not need that. The general theory is that a nation can only verify data that they themselves have.
Some ccTLDs have rules against registrations by people not located within the country that owns the ccTLD, in which case a valid national ID or organization number would be required. From what I can see, .es does not have that requirement.
The concept of WHOIS has felt sleazy for many years.
If I register a domain, the registrar will basically extort me a couple extra dollars per year for “domain privacy” for the privilege of not having my name, home address, phone number, and email publicly available and then mirrored across thousands of shady scraped content sites in perpetuity. Even if you don’t care about that, then the never-ending emails, texts, and calls begin from sleazy outfits who want to sell you related domains, do SEO for you, revamp your site, schedule a call, or just fill your spam box up with legitimate scams and bootleg pharma trash.
All because you wanted a $10/year dot com without paying the bribe.
And yes I grew up leafing through well worn phone books next to corded phones. This is not comparable.
To be fair, OP never said this was necessarily related directly to the article.
I’ll often post loosely related tangents like this because I would enjoy discussing the tangent with the HN crowd, but there’s often not a better opportunity to discuss it, so why not while we’re sort of on the topic anyway.
Ack that I don’t think it makes sense to discuss not even remotely related topics. But as long as it’s in the ballpark and it’s not going against other guidelines and leads to interesting discussion, I think it’s fine.
Indeed. Furthermore, the fact that there is a replacement makes the discussion even more pertinent in this case, since OP is arguing for the abolition of any such protocol.
That was a common racket a long time ago, but pretty much every widely recommended registrar offers free whois privacy now. At least when they're allowed to, some TLDs forbid obfuscating the whois information.
a little less than a year ago, my wife registered a .us domain that she ended up not using at all. she still gets phone calls nearly daily from people trying to sell her web design/dev work
I have two .in domains with namecheap and whois data is all "REDACTED FOR PRIVACY" despite namecheap not allowing me to add domain privacy when I purchased the domains.
I’ve looked into it a bit more, and turns out there are two options for redacting WHOIS data:
- “Privacy service”, which is these funky-named LLCs replacing your data in the WHOIS
- Just the redaction, which replaces almost all data with REDACTED FOR PRIVACY (except for the registrant's country, state, and organization name).
No idea why or how any of this works! Apparently, Porkbun does both: on another domain of mine, aedge.dev, it shows REDACTED FOR PRIVACY and replaces the org name with “Private by Design, LLC”. For notpushk.in, it does show my country (RU... looks like I haven’t updated my address in a while lol) but everything else is redacted, too.
Spaceship on the other hand doesn’t bother and returns only this tiny response:
    Domain Name: lunni.dev
    Registry Domain ID: 4AF9AE073-DEV
    Registrar WHOIS Server: whois.nic.google
    Registrar URL: None
    Updated Date: 2025-03-10T13:01:35Z
    Creation Date: 2022-12-11T02:30:54Z
    Registry Expiry Date: 2025-12-11T02:30:54Z
    Registrar: Spaceship, Inc.
    Registrar IANA ID: 3862
    Registrar Abuse Contact Email: abuse@spaceship.com
    Registrar Abuse Contact Phone: +1.6027723958
    Domain Status: clientTransferProhibited https://icann.org/epp#clientTransferProhibited
    Name Server: coco.bunny.net
    Name Server: kiki.bunny.net
    DNSSEC: unsigned
    URL of the ICANN Whois Inaccuracy Complaint Form: https://www.icann.org/wicf/
    >>> Last update of WHOIS database: 2025-03-17T17:11:09Z <<<
Edit: or, rather, that’s what whois.nic.google returns for a domain registered in Spaceship.
According to German law, every website that is owned and operated by a person or entity in Germany needs an imprint with the full name, address, email address, and phone number of the owning person or entity…
a) This is only for commercial websites although what counts as commercial is vague and probably not something you want to argue in court so it's safer to just add it unless you are absolutely sure.
b) You need a valid postal address where you can receive mail but this doesn't have to be your home address. A PO box is fine.
c) You don't need to have a phone number in your Imprint.
The base requirement of commercial operations having to have valid contact information (that can be used for legal communication) is pretty sensible. The details could be a bit friendlier towards individuals running purely personal sites.
So this in practice is a massive push to centralization: if you have a Facebook page or Instagram account, you don't need to risk that level of privacy compromise.
At the same time, expecting that your NAP info isn't already in the hands of anyone who wants it makes no sense in this day and age.
Between the countless DB leaks and numerous infostealer campaigns, and considering that anyone who has you in their contacts list is extending the exposed surface area, it's untenable. Other events like marriage and home ownership further complicate any attempt to keep your name and address private.
Not saying you shouldn't opt for domain privacy, just giving a reality check. To really enforce your privacy you have to have multiple phone lines and a shell company, at the least. And really, even that isn't enough unless you can also commit to being a hermit.
There is a tangible difference between some people having this data somewhere out there, and literally anyone who wants to have it being able to look it up in a few seconds using tools already installed on almost every computer anywhere.
The ability to look up the correct contact details for a commercial enterprise on that enterprise's website is a good thing imo. It is (or was) part of the EU requirements for commercial websites (anything selling, giving purchase advice, advertising, ...).
It's a useful filter, a seller without identifiable people and location is a big red flag.
Exactly. All their info was scraped long ago. Whois and abuse info, it all needed to be deprecated a few decades ago. But, pity the poor fool who actually contacts me. I treat them like regular scammers. Get all the info, and then tell them to pound dirt.
Except for the guy who tried to sell me annuity liquidation. Yes, if the person gets unalived earlier than expected, you win.
In related news, I saw someone buy $150 worth of lottery tickets, as I was on the way to a large hospital to visit a sick friend. The lottery guy I am sure lost, and the hospital guy (profit-care) won, while the ward was understaffed (a profit-center). And 7 out of 8 fare collection machines were out of order (deferred maintenance as a profit-center). I get the distinct feeling that corporate America just does not even care in the slightest.
For the organization that managed the WhoIs? The horse left the barn so long ago, its great-great-great-grandchildren are old and gone. Long gone.
you just have to have enough money to have some legal entity register on your behalf and that legal entity then has their system spammed, but they have their phone public anyhow...
the idea is to have individuals accountable while not annoying owners.
in that sense it makes _perfect_ sense and works as intended.
a proper solution ingredient would be trustworthy and affordable pseudonymity, and that can be lifted by court orders only. but then who guarantees the independence of courts? and the fairness of laws?
So .us is more trustworthy than .com. Good to know.
I'm one of those who think that developers are hiding too much, which makes things like VS Code extension viruses rampant.
I won't force you to not be anonymous, but if you are going to run your software on my device I want some accountability. Our salaries should also reflect that.
So far I haven't encountered a single actual virus, and if you're referring to the recent Material Theme debacle, there was never any malicious code involved, only third party libraries with obfuscation.
I think I understand your point, but your wording leaves some ambiguity. If I am running my software on your device you must be a cloud provider. In that case, the accountability you are looking for is probably not provided in the same way it would be if you were running my software on your device.
Either way, your aversion to anonymity of developers is interesting. It's a discussion for a different thread, but I think an important one.
It would be nice to find such a thread. This is a pet peeve of mine.
It’s one thing if you have a PO Box, and it’s consistently used in your various documents and registrations. I get wanting a firewall to direct availability.
But if I can barely find evidence you exist other than your software, or if you operate a fairly large scale service and you haven’t filed a yearly required corporate report (a specific example I recently came across), then those are red flags to me. Not immediate showstoppers necessarily, but if you’re trying to get me to make a purchase, I probably won’t.
It’s fine if you have domain privacy turned on, but if you’re selling me software or services, you have got to offer some kind of evidence that you have a business nexus someplace. In a business context, I’ve got to know that for avoiding sanctions violations at the least.
Dear User, Our system has identified an unpaid toll charge linked to your vehicle. To avoid additional fees or service disruptions, please settle this matter within 12 hours.
Best of luck trying to get an unknown Chinese registrar to stop their spam. My carrier does not even have a clue. My routers now block anything *.xin. Anything and everything.
> The concept of WHOIS has felt sleazy for many years.
More recently, yes. But the original (perhaps naive) goal was to keep domain owners accountable for whatever they were serving from hosts under their domains. That seems reasonable, at least on a more "polite" internet, where things weren't scraped and monetized and SEO'd into garbage.
The general purpose of publicly accessible registrant data is that people should be able to contact the owner of the domain in case of an issue, rather than the registry or registrar. "domain privacy" is simply the registrar putting themselves as the domain contact and becoming a forwarding service to you.
For large companies, and registrants under those ccTLDs that require local presence, it's not uncommon that a legal firm acts as a proxy for the domain owner. This is a service that they take a few dollars for, and is in many ways similar to domain privacy.
The requirement of having the registrant as the contact person for a domain is something that (to my knowledge) comes from ICANN, and I think it has a positive effect. A domain should be owned and controlled by the registrant and not the registrar, which is then reflected in the contact information. In an alternate history we could see that the registrar (or even registry) owned the domain and only leased it to the registrant, in which case the registrant's power would be limited to other online services that people "buy" today.
Web hosts competing based on who had the prettiest cPanel theme. The number of email accounts you were allowed was something that mattered. If you were lucky enough to get SSH access, it was jailed and only really allowed you to move files around easier or edit something with vim/nano.
Oh, I have unintentionally become a GoDaddy customer (a company I have spent ample time hating and shitting on over the years) because I was a legacy Media Temple customer going back to like 2006, they eventually got acquired, and I still just can't be bothered to clear out everything on those sites/domains.
Let's Encrypt has done great work with certs for free. But certs did cost money before. It's insane how long unencrypted traffic was the default, but I could not have done anything if browsers had soft-enforced HTTPS earlier. I simply could not have paid that money.
I still can't get my head around why a .com costs $9.59 (plus registrar margin)
There are 160 million registered .com domain names.
I understand that operating root servers isn't free, but surely they don't cost $1.5 billion per year! Wikipedia's hosting costs are $3 million per year, for comparison.
Only $0.18 goes to ICANN, the non-profit. The rest goes to Verisign, a publicly traded for-profit company, which ultimately gets that $9.59. I bring this up because it of course _doesn't_ cost that much. Incidentally, Verisign posted $1.56 billion in revenue last year and spent about $1.21 billion on stock buybacks in the same period.
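A back-of-envelope check of the figures in this thread: 160 million registrations at the $9.59 wholesale fee, minus ICANN's $0.18 cut, lines up with Verisign's reported ~$1.56B revenue (the small gap is plausibly .net and other services):

```python
# Figures quoted in the thread above.
domains = 160_000_000     # registered .com domains
wholesale = 9.59          # per-domain annual wholesale fee, USD
icann_fee = 0.18          # portion that goes to ICANN

verisign_take = domains * (wholesale - icann_fee)
print(f"${verisign_take / 1e9:.2f}B")  # roughly $1.51B
```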
Because that doesn’t solve the problem. The demand doesn’t go away if you charge less – if you charge $1/yr for .COMs, they will all be permanently squatted. (Well, like now, but worse!)
We could use anti-scalping techniques, but that’s non-trivial to implement. Perhaps some name squatting policy? No idea how to enforce it though, especially without money.
Yeah, that’s a good point. Then again, you could also say that for any other gTLD (why should Google get the proceeds from .dev?), and that would be a valid question.
I think the current system is inherently flawed... but it kinda works, and nobody wants to figure out the politics of fixing it – so I guess we’re stuck with it for a while.
Of those 160 million, what percentage are on 1-year renewal plans, and how many are on multi-year plans? I'm guessing the vast majority are yearly. It would be interesting to know how many of them never get re-registered after the first year.
I agree that it's ridiculous, but absent some sort of regulation, things are not priced based on how much they cost the provider, but based on how much people are willing to pay. Even if they're unhappy about it.
The thing is there are supposed to be regulations. .com is not privately owned but a public good that is supposed to be regulated by ICANN with the interests of the public in mind.
Just in case: you can get a .com for less than that nowadays, sometimes $3 for the first year (then transfer it back and forth for $5–7). Here are some price comparisons: https://tldes.com/com, https://tld-list.com/tld/com
I assume some registrars sell these at a loss and expect to offset that by selling you WordPress Supreme Ultra Enterprise hosting for... $40/yr? No idea how this works.
Because it's a natural monopoly. Nobody ever got taken seriously with a .biz address.
(.com is basically price-regulated because of this, FWIW, Verisign can't just raise prices whenever or however it wants. But obviously it's still a pretty sweet deal for them, I'd imagine.)
Hell, even .net will lose you traffic. If someone has your desired name on .com, forcing you to use any other TLD, you will lose traffic. If your .com is taken by someone in the same line of work and not just a coincidental use of the same domain, then you'd be insane not to change the domain. I'm not sure how many people manually type domains any more (I do, though), and .com is muscle memory.
They always list it in the line items and in the renewal but whatever. In fact, it looks like I forgot to turn on auto-renew on their domain privacy product so it's sitting there in the 'grace' period. They work as a registrar so I use it.
> The concept of WHOIS has felt sleazy for many years.
The concept of most internet things has felt sleazy for many years. Right around the time that businesses started monetizing the internet is when that feeling really kicked off tbqh
> the registrar will basically extort me a couple extra dollars per year for “domain privacy” for the privilege of not having my name, home address, phone number, and email publicly available
I was going to buy a domain back in my student days, but I stopped when I realised I didn't have a phone number. I used the public phone-box on the corner whenever I needed to actually call anyone. It was a little annoying to have to register a phone number when I didn't actually want anyone to call me.
GDPR is what changed this. Before that, registrars had little incentive to hide it for free when they could instead charge you for the service. It was not trivial that Google Domains (rip) came with free privacy proxy right from the beginning.
It's not so much that registrars had little incentive, but rather that GDPR made legitimate interest the criterion for when registries should give out public information about domain ownership. That allows the contact information to still point to the correct domain owner without going through a proxy, while still creating a small hoop for parties interested in extracting ownership information from the registry.
One can see this in practice in that company registration information is usually still available (though often behind a captcha), while personal information of private registrations requires additional steps to demonstrate a legitimate interest. All this is also generally occurring at the registry level, rather than at the registrar.
It should be mentioned that privacy proxy is very similar to a straw man registration. If the registered owner is the proxy, then you are trusting that the proxy will honor the contract that is linking you with the property.
So I've walked past Lennart Poettering's house before without knowing it. (And that is not the sort of area where I'd have guessed he would live.)
If I were some kind of crazy maniac, I could pay him a visit and shut down systemd for good. You see why having this information out there is dangerous?
For .pl TLD, due to GDPR, domain data is hidden by default for private individuals (as opposed to companies), yet some registrars still try to upsell the "domain privacy", hoping you don't know about it.
Note that it is being replaced with a different protocol. Is there any indication that there are less stringent requirements on identity data disclosure in the new protocol?
Interestingly, when discussing WHOIS with my networking students, I discovered .edu WHOIS is not (cannot be?) hidden. I suppose EDUCAUSE either requires WHOIS to remain open or does not offer information hiding.
Doing some WHOIS lookups, we found a point of contact at a university, called the network admin said hello and launched into an impromptu network admin interview. It was cool stuff. I emailed him later in the day to apologize to and thank him for being a good sport about the whole thing. He (fortunately) found it all rather enjoyable.
Some other TLDs, like .us and .in, also forbid WHOIS privacy. TLD owners are free to set whatever policy they want around this. Perhaps .edu does the same.
It's useful for checking if a domain name is taken without doing that through a registrar, which is both less convenient, and (in case of shitty registrars) can be sold to domain speculators.
Both give you a way to find out the domain's registrar, registration date, transfer status, and administrative contacts like abuse@. Nameserver data can also be somewhat useful.
Otherwise, what did you expect the registrar to divulge to you, a random passer-by?
As an Australian, I can look up the ownership of random properties in the US for free. But if I want to do the same for a building on my own street, I have to pay a US$11 fee per property searched.
The US has a reputation of being a hypercapitalist society, yet they seem to be behind Australia in the descent into hypercapitalism by not (yet) privatising the registration of land titles. [0]
Considering Australia (SA) invented the concept of the Torrens Title which means that we don't have to pay extra to protect a piece of paper, and that the Titles Office has always charged for access to titles, I don't think that this is the "hypercapitalism" hill to die on.
It also means that banks can't sell mortgages out from under their borrowers because all liens and other financial liabilities attached to a title are known.
It doesn’t because you can negotiate a bulk discount. If you want all the titles, they’ll sell that to you - for a huge fee, but still a big discount off paying for them all individually. So essentially it prevents mass scraping by individuals and small businesses, while posing no real obstacle for megacorps with megabudgets
Wow. I never noticed how much the way I use the internet has changed. I haven't done a WHOIS in a decade.
When I started using the internet, it’s how I contacted people. If I liked their site or their blog, I’d check who was behind it and get an email address I could contact.
Now… humans don’t really own domains anymore. Content is so centralized. I obviously noticed this shift, but I had forgotten how I used to be able to interact with the internet.
My only nitpick is that humans still own domains, but I agree with the overall sentiment and thank you for sharing this perspective.
It is fascinating to consider how our experience with the internet is changing over time.
Remember phreaking? Having been born in the Netscape era, I certainly don't, but I can imagine that losing the ability to pull that trick off must have felt like a loss to those who were initiated in the art.
Thankfully the trend appears to be that new technologies and thus new 1337 h4x are still forthcoming.
I think in most ways it's better, it makes the web more approachable to less technical users, making it less gate-keepey, but I also kind of miss the loosely-coupled cluster of web pages from the late-90's and early 2000's web.
Stuff felt less homogeneous; everyone had kind of a loose understanding of HTML, and people would customize their pages in horrendously wonderful ways. It felt more personal.
So many tech people have a fondness for that time. To me, it was a very narrow slice of the human experience. Today I can find sites and communities on any subject I can conceive and billions more that I cannot.
And personally I found it more horrendously ugly than horrendously wonderful. But that's just my opinion.
Yeah, as I said, in most ways things are better now than they were in the rose-tinted memories of the late 90's and early 2000's. Now if you want to say something on the internet, you can open up a Substack, or a Bluesky, or a Medium, or you can find a niche Subreddit. You don't need to know anything very technical, and that's a good thing.
I'll acknowledge that the old web was ugly, even at the time. I guess I just liked how much of it was, for lack of a better word, "custom". Most people were pretty bad at HTML, common web standards really hadn't taken hold outside of "make it work in Internet Explorer", and CSS really hadn't caught on, so people glued together websites as best they could.
Most websites looked pretty bad, but they were genuine. They didn't feel like some corporation built them, they felt like they were made by actual humans, and a lot of the time, actual children. I was one of those children.
I posted about this a week ago [1], but my first foray into programming was making crappy websites. It felt cool to me that a nine year old could make and publish a website, just like the grownups could. I didn't know anything about style so I had bright green backgrounds and used marquee tags and blink tags and I believe I had a midi of the X-files theme song playing in the background.
I guess it's the same sentimentality that I have when I look at a child's terrible drawing or read one of my old terrible essays from when I was eleven years old that my mom kept around. They're bad, they're embarrassing, but they're also kind of charming.
> Yeah, as I said, in most ways things are better now than they were in the rose-tinted memories of the late 90's and early 2000's. Now if you want to say something on the internet, you can open up a Substack, or a Bluesky, or a Medium, or you can find a niche Subreddit. You don't need to know anything very technical, and that's a good thing.
By 1999 you could create a LiveJournal or find a niche forum through Google. You didn't need to know anything very technical.
You could, Xanga as well, but it was still less connected. People complain about recommendation systems on YouTube and Facebook and Reddit, but one thing they do well is give people more reach than they probably would have gotten before.
I've found so many interesting YouTube videos from people that I hadn't ever heard of, just because YouTube recommended them to me. Stuff like that didn't really exist for quite a while; for a long time the best you had was aggregator sites like ThatGuyWithTheGlasses.com or similar.
> I think in most ways it's better, it makes the web more approachable to less technical users
There's a big gap between looking up someone's contact info using a protocol that many tools and websites implement (anyone can open www.who.is from search results) and the second example of needing an understanding of HTML to make a webpage. I don't think it's gatekeepey to be able to email the human behind a given website, whereas the current internet is full of walled gardens, gatekeepers, and faceless/supportless services (thinking of Discord, Cloudflare, and Google as respective examples)
We can have both human-run services and WYSIWYG website builders on the internet concurrently
Less gatekeepey? Big Tech is literally the gatekeeper. Want to see a story without an account? Too bad. Want to see what events are going on without a Facebook account? Too bad. Want to search Discord or Twitter? Too bad. Big Tech sucks in all user content and then hides it behind paywalls.
I think a lot of people fail to appreciate that the alternative to big tech taking over was not keeping things exactly the same as they were 20 or 30 years ago, but developing in a different direction.
It was the direction in which people expected things to develop: decentralised and democratised. There was a lot of optimism about empowering individuals.
I sometimes use whois multiple times in a day lol.
Should it exist? Maybe not, probably not, but that doesn't stop me from using it when I want to try to do some sleuthing. Most of the time though it doesn't work because they have privacy enabled.
I did get screwed once with certain TLDs not being able to enable privacy. I had registered a .at domain to use with a video site I had that at the time was reasonably popular and going viral fairly regularly. I hadn't realized beforehand that privacy wasn't possible, but once I learned, I didn't love it, but I wasn't sure if it would matter that much. I was wrong. I was getting calls and emails regularly from random people on the internet who found our content on reddit or whatever and decided to do some sleuthing
That works great until the TLD decides you need to hop through extended verification and fork over an identity card plus a recent (3 months) invoice showing the address you signed up with 12 years ago, freezing your domain so that you can't update the information to your current address even if you wanted to share that with the world (because privacy doesn't exist and GDPR doesn't apply in French-run, France-headquartered AFNIC). There's no time to dispute it or go back and forth: the initial email already announces that your domain will go dark if they haven't processed your response within 14 days. Oh yeah, and you need to submit this via plain text email. If you send a link to the PDF scan, so that you can remove it after they've viewed it, that gets rejected (though it will be downloaded by an overseas system, run in the USA, within seconds of sending it); they'll respond that it specifically needs to be an attachment so that it lingers in their inbox forever.
If you use fake info in relation to WHOIS data, you also need to be prepared to forge an identity document (a pretty bad felony in most countries per my understanding)
That said, on most forms I enter fake info because they have no legitimate use for it anyway and they can't compare it against anything either. Buying a game or event ticket needs my address? For what, linking my purchase to a profile they're building? Nah, fake address it is.
I did a Whois last week to prove to my previous registrar that I'm no longer with them, and that the invoice they sent was invalid. Unexpected use-case, but useful.
On the other hand, I did a WHOIS days ago to check up on a potential scam site my partner landed on while working on an e-commerce platform. I hope some alternative exists, people using Let's Encrypt leaves an entry in the transparency log but people don't necessarily need to use that. I haven't researched the alternatives to WHOIS yet but now I'll have to.
Although shit did happen back in the day. Someone showed up at the house of the DeviantART CEO in like... I wanna say 2007? and slashed his tires etc. WhoIs was only cool in the 90s.
Bit deceptive to editorialize it into something that sounds like something else much more interesting (removing contact info from domains) but isn't the case at all (they're just changing the method to access the same info).
I like WHOIS with its extreme simplicity [0]. RDAP, on the other hand, works on top of a large and changing HTTP [1], and uses a JS-derived serialization format [2]. RDAP has advantages, such as optionally benefiting from TLS, the data being better structured and defined, but the cost in added complexity seems high.
It's a bit unreasonable, IMO, to criticize the fact that RDAP communicates using a JSON API -- while JSON is inexorably related to JavaScript (and it's not without its issues), it's ubiquitous on the modern web for serializing data, in any even vaguely REST-shaped API.
You could argue that a more compact, binary, wire format is more appropriate (though I wouldn't, in this case, since for small, simple payloads, I think simplicity and human readability trumps sheer wire efficiency). You could argue that JSON's a poor serialization language in general (which is debatable, contextual, and in this case, I don't think there's a widely-accepted better option).
But let's not act like "a JS-derived serialization format" is some kind of mark of the beast here.
As far as I can see, an RDAP request is a simple HTTP request, looking like http://example.com/rdap/ip/192.0.2.0. Web servers still support HTTP/1.1 (or probably even HTTP/1.0 and HTTP/0.9). This is trivial to implement for clients; a simple HTTP request like that is about the simplest thing to do. You'll have to use curl or wget instead of netcat if you want to do it manually. No big deal.
"A JS-derived serialization format" ... You mean JSON, which is about the lowest common denominator in Internet data exchange these days (and has been ever since we found out that XML was overly complex and JSON was much easier to use). You'll have to use something like jq instead of grep to extract information from the data manually. Or rather, you'll be able to use the powers of jq. Again, I don't really see the problem here.
I did not mean that there is a problem with it, only that I appreciate the simplicity of WHOIS. While HTTP-with-JSON is perhaps the most practical solution these days.
To clarify my point of view, an ad hoc HTTP client for this indeed should not be hard to write from scratch, demonstrating that there is not much complexity in that. The server part would be a little more tricky; still doable, but not as easily as for WHOIS, and in most cases a more sensible approach would be to use libraries (or a program like curl, in case of shell scripting or manual usage) for that, as you said. Likewise with JSON: though one can deal with it as with text, some added tools (a library or jq, depending on context) would be sensible to use. But then added dependencies lead to all kinds of issues in non-ideal conditions (e.g., when it is problematic to install those). But again, I am not saying that this should stop adoption of RDAP.
On top of that, a complete and proper HTTP 1.1 implementation, server or client, would be quite large. And JSON, while indeed common and not particularly complicated, still has bits I find awkward (no sum types or a standard way to encode those, but has "objects", arbitrary-looking primitive types; no single standard for streaming, either), so working around it is not exactly pleasurable. Those add up to a difference between a trivial protocol and, well, a non-trivial one. I appreciate such trivial yet working and useful solutions, though the other kind is commonly useful as well.
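To make concrete just how thin the client side of RDAP is, here is a minimal sketch in Python (standard library only). The path scheme per object class comes from the RDAP query spec; the base URL is the same placeholder used in the comment above, not a live endpoint:

```python
# Minimal sketch of RDAP request construction. RDAP defines simple
# path segments per object class: /domain/<name>, /ip/<addr>,
# /autnum/<number>, each returning a JSON document over plain HTTP(S).
from urllib.parse import quote

def rdap_url(base: str, object_class: str, identifier: str) -> str:
    """Build an RDAP query URL; base is the server's RDAP root."""
    return f"{base.rstrip('/')}/{object_class}/{quote(identifier)}"

# Placeholder base URL, as in the comment above -- not a real server.
base = "http://example.com/rdap"
print(rdap_url(base, "ip", "192.0.2.0"))       # http://example.com/rdap/ip/192.0.2.0
print(rdap_url(base, "domain", "example.org")) # http://example.com/rdap/domain/example.org
```

Fetching one of these URLs with curl and piping through jq is all a manual lookup takes.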
Most people won't even notice this change. They'll still go to a "whois lookup service" and input a domain, and get the same results. The fact that it arrived via a different protocol (RDAP) won't mean anything.
I do. The terms of the domain registration say that providing incorrect information can result in revocation of the registration. Not really worth the risk, IMO, for any domain I actually care about.
Not just that, but also if the registrar turns out to be fraudulent or someone convinces your registrar to transfer the domain (scam the support team), or they get your account password and transfer the domain that way (data leak elsewhere, password reset with a sim swap, you name it)... there are so many ways you can have "technical difficulties", but in the end: you're the one with an ID card that has your name on it. You can take the TLD to court and have them give you back the domain that was legally purchased in your name
Except if it's not in your name
So yep, as you say: make this decision (fake or real information) knowing the risks involved in not legally owning it
I'm not sure this follows. You're allowed to publish, say, a book or pamphlet without signing it with your legal name and address. So is a website more like a book, or a building?
Somewhere in the middle IMO. If the domain name is desirable it looks more like a building, because people generally care about who owns the land when it is not getting put to good use.
Websites are more like books when they have a domain no one else cares about.
So, maybe require official ID/address/contact info for any domain over a certain price? Or for all domains under a certain character count, maybe, which could vary for TLD.
How do you determine the value of a domain name? Also there's nothing particularly valuable for most short domain name strings except on .com. It's generic words that tend to be valuable, not a short random string.
Domains point to IPs, and IPs already have subpoenable ownership records at RIRs. In the real estate metaphor: we have property ownership records, but we don't have records of every rental tenancy.
That's not true. Those are registration records, NOT ownership records. People do not purchase IP addresses or domains; they register them for temporary use.
For non-legacy allocations, point taken (but my original comment still stands if you replace "ownership" with "registration"). For legacy allocations, it's more complicated.
This article is not inconsistent with my comment. The court rejected a subpoena against the ISP for the identity of the user of the IP, not against the RIR for the identity of the owner of the IP. This is like the court rejecting a subpoena against the landlord for their tenant's identity.
ICANN accredited domain registrars (so any registrar selling generic TLDs like .org, .com, .design etc) have contractual obligations related to technical abuses like phishing, malware, and botnets, insofar as they intersect with a domain name.
Content/expression related harms are outside of ICANN's bylaws, and any obligations related to what a domain points at come not from ICANN but from the laws of the jurisdiction in which the registrar operates. This is generally good. There is no global standard for acceptable limits on expression, with the possible exception of CSAM, which is illegal everywhere.
Requiring domain registrars to arbitrate what content should be accessible via the DNS is perilous.
(or brew install, etc., depending on your os and tooling). The jq formatted output is a little more verbose than the whois one, but three cheers for a well-specified machine-parsable format. (and rdap has a pretty-printed format output also)
Back in 2014, when the TLD .church was introduced, my friends and I tried to register alonzo.church and (ab)use the contact information records to provide some biographical information and links, so that running literally `whois alonzo.church` on the command line would explain who he was. That would not prevent hosting whatever services on that domain as normal.
Sadly, we were not able to secure the domain on time, and after 11 years, the attempted trick is becoming irrelevant.
I don't play with domains all day, but this very much feels like nothing important was accomplished, and things are just being made more complicated for political reasons. Sorry if that is being harsh, but I've never had any issue using WHOIS.
If you've ever tried to parse WHOIS programmatically, you'd realize that it being an unstructured blob of text is actually quite unconducive to it being useful. Having every endpoint return a standardized JSON payload specified in an RFC is much better.
I've had domains registered for over 30 years. I liked WHOIS because it provided a means to report abuse, which has gone from zero 30 years ago to massive amounts of daily spam and network probes.
I was not happy when ICANN began to allow privacy features in domain registration data, and I never made mine private. Most reputable sites still provide contact information via WHOIS.
Hopefully RDAP will be a suitable replacement. I haven't tried it yet.
> I was not happy when ICANN began to allow privacy features in domain registration data, and I never made mine private
The issue for me is that you can't simply publish contact information. It requires you to either publish a legal owner in full or nothing. I can't publish abuse@example.org as contact method (because, yes, I do want to receive an email if someone finds an issue with my services), I need to publish also a legal name, address, sometimes a phone number. Those things cost money to set up to be fake-but-legit (burner SIM card, rent a letterbox somewhere, get someone else to submit their name and ID card) whereas an email address is inconsequential to publish and I can rotate it monthly to avoid it becoming enrolled on too many spam lists
So my sites never provided contact info via WHOIS when I could avoid it, yet I'd think my sites are as reputable as they come. You can always find a plain old email address via some link on the homepage and I have no spam filter (just email address rotation) so there is no chance that you're algorithmically filtered out, either
In the case of YC they defer to the AWS dns admins but you can set it to whatever you want unless your DNS provider does not let you. I've always run my own DNS so maybe that's less of an option for hosted DNS these days for all I know.
I have no doubt some of the benefits are definitely the ability to resell or access that data once again. I literally just told someone yesterday: "don't pay for domain privacy, any registrar worth a damn includes it these days".
If you `ping`, your recursive resolver (like Google DNS, or your ISP's DNS servers) will do the recursive lookup for you.
WHOIS data are irrelevant to resolving the host IP address. The SOA will be used to find the primary name server (for an AXFR lookup, perhaps), but generally each NS entry is used in a round-robin fashion and the SOA isn't queried.
Most resolvers just ignore duplicate records, but I imagine some may change the "odds" and be more likely to pick the duplicated NS entry.
Finally, most authoritative servers do not want to spend resources on ANY queries and almost always don't return all records, or, like you saw, don't de-duplicate answers.
> Do you know why the name servers are part of the WHOIS data?
The NS returned from the registrar's WHOIS server reflects the registrar's view; the NS returned from the TLD nameservers reflects the registry's view; the NS returned from the zone's authoritative nameservers reflects the registrant's view. These should typically be the same, but can differ.
> why is the name server present in SOA record too?
The NS in the SOA record is used for RFC2136 dynamic updates and RFC1996 zone replication.
If you're trying to debug why a website's setup isn't working, the first step is to see if what the registry thinks the nameservers should be matches what the nameservers in DNS actually are. These can fall out of sync if e.g. the registry's connection to its DNS provider is experiencing issues. This does actually happen from time to time.
The NS record wins. The data in WHOIS is just non-operational metadata, WHOIS is not used for lookups.
Which server gets used is usually randomized from the set of possible ones. Same for which of multiple A or AAAA records are used to connect to.
Us sysadmins would love to be able to specify weights or round robin or retries (like with SRV records) to move load balancing and failover to the clientside but for whatever reason browser vendors have rejected this for years.
In practice it will round-robin because all of those guys have the same performance characteristics but through whoever else is upstream of you in the DNS chain. The SOA isn't used for resolution so it doesn't matter there.
The NS records and the WHOIS should be the same usually. One comes from the registrar's configs and the other from your next level upstream resolver (which should, unless it's cached and a recent change happened, be the same). But the thing that is used is whatever your next level upstream resolver is, which is the `dig` output unless you did `dig @someoneelse`.
The SOA nameserver is pretty much only significant for DNSSEC these days. In the AWS case there, I don't think it does anything unique. Pretty much there just to meet the standard.
I remember in the past I've managed to screw up my setup so that the name servers on WHOIS and name servers on DNS NS records mismatched. I can't remember which record won during name resolution.
I guess I still don't understand why the name servers need to be both in WHOIS records and DNS NS records. Does the name resolution use the name server data in WHOIS records in any form or manner?
In short, name resolution does not use the records in WHOIS.
Think of the WHOIS information as more of an administrative database, and the actual DNS servers (which are located at the location of the NS records) as the operational database.
It is useful to know, in your administrative database, how to get to the operational database, but it does not hold all of the information -- just where it is located.
In operational contexts (actual DNS lookups), you only use the operational database (the nameservers).
In administrative contexts (transferring a domain between registrars), you use the information from the administrative database (WHOIS).
There are additional wrinkles, like GLUE records, but those are probably a bit beyond the scope of what you're asking.
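The administrative-vs-operational split also suggests a simple consistency check: compare the NS set the registry publishes against the NS set the zone's own nameservers answer with. A toy sketch of just the comparison logic (the hostnames are invented; an actual check would gather both sides first, e.g. with dig against the parent and against the zone):

```python
# Toy sketch: compare the NS set from the registry (administrative
# view) against the NS set the zone itself answers with (operational
# view). Hostnames below are invented for illustration.
def ns_mismatch(registry_ns: set[str], zone_ns: set[str]) -> set[str]:
    """Return nameservers present on one side but not the other."""
    def normalize(names: set[str]) -> set[str]:
        # DNS names are case-insensitive; trailing dots are cosmetic.
        return {n.rstrip(".").lower() for n in names}
    return normalize(registry_ns) ^ normalize(zone_ns)

registry_view = {"ns1.example.net.", "ns2.example.net."}
zone_view = {"ns1.example.net.", "ns3.example.net."}
print(sorted(ns_mismatch(registry_view, zone_view)))
# ['ns2.example.net', 'ns3.example.net']
```

An empty result means both views agree; anything else is the out-of-sync situation described above.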
People say WHOIS is useless these days due to WHOIS privacy, but it's useful for at least one thing: checking when a domain was registered/transferred. Fishy stuff tend to be registered/transferred recently. Also older and larger companies tend to not hide their organizational identity.
Btw, I tried the icann-rdap CLI tool and the default rendered-markdown output mode is atrocious. Sea of output, each nameserver has one or more standalone tables taking up 15x$repetition lines, almost impossible to fish out useful info. The retro gtld-whois mode is so much cleaner. Their web tool https://lookup.icann.org/en/lookup is fine too, don't know why the rendered markdown mode isn't like that. WTF.
I can remember times when you could still see the names and addresses of registrants in whois records. That was before abuse and fraud became everyday occurrences in today's internet.
I miss the times when we could still believe in basic human decency.
The main benefit of whois and RDAP is to see which registrar handles a domain and when there were recent changes or upcoming expiry etc. RDAP is also useful to see who operates an IP address etc. I've been using RDAP for a few years but the service has been spotty, hopefully that improves now.
One bright side of ICANN being a California non-profit is that when they tried to sell off .org to their own confederates so they could juice up the prices they were stopped from doing it. If they were in other places, I imagine it would have gone through.
it was fun when having a network solutions/internic contact handle was a badge of honor.
the early internet was fun. whois was always a fun dimension.
is there a canonical rdap client that will end up everywhere? one of the nice things about the early Internet was that there were canonical utilities that were everywhere.
WHOIS usually needs its own port open; this is good I suppose, now it's all HTTPS. Now, if only passive DNS resolution data were part of this same API. As it stands today, if you're looking into WHOIS information, historical WHOIS and passive DNS are a must, and they are usually provided by commercial entities.
What does this mean for the command line tool whois? It definitely works still and it's still being updated...
> whois ycombinator.com
% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object
refer: whois.verisign-grs.com
domain: COM
organisation: VeriSign Global Registry Services
address: 12061 Bluemont Way
address: Reston VA 20190
address: United States of America (the)
contact: administrative
name: Registry Customer Service
organisation: VeriSign Global Registry Services
address: 12061 Bluemont Way
address: Reston VA 20190
address: United States of America (the)
phone: +1 703 925-6999
fax-no: +1 703 948 3978
e-mail: info@verisign-grs.com
contact: technical
name: Registry Customer Service
organisation: VeriSign Global Registry Services
address: 12061 Bluemont Way
address: Reston VA 20190
address: United States of America (the)
phone: +1 703 925-6999
fax-no: +1 703 948 3978
e-mail: info@verisign-grs.com
It has already stopped working for domains on TLDs that have sunset WHOIS and over the next few months it'll stop working for a lot more TLDs and registrars. The command line tool is nothing more than a thin client that queries a server WHOIS endpoint.
Glad I read this, I wasn't aware whois was being sunsetted. Now I have to change one of my critical services to do rdap. Wow. How can you sunset the main service that is the backbone of the internet?
From what I've seen most domain servers don't really implement the history components of RDAP, which is a shame - being able to see if a domain ownership lapsed or was transferred historically would be great for being able to determine if somebody's email address is still trustworthy or has been stolen by a domain transfer.
Anyone experienced with this: I am not seeing abuse contact info, usually a phone number or email. Am I supposed to follow hyperlinks to get this info or something? Like search the registrar for this data?
ICANN's DNS is one of the only systems on the internet that requires people to continually pay money to have a name. X, YouTube, Facebook, Reddit, Twitch, etc. all let you register a name for free and without submitting all of your personal information. The entire model here is out of step with what users want.
i’m glad it requires money. $1/month for a top level name isn’t much, and it means there are lots of good names available rather than all of them being grabbed by someone not interested in using them. when making a reddit account it’s actually pretty tricky to find a decent name that’s available
I think both models have a place. Sometimes I just really want a persistent identifier that I can take with me (unlike an IP) with minimal maintenance. Even if it is something unreadable like a UUID.
We should totally have a free .uuid TLD (which will predictably get blocked by 90% of networks... Although DoH would probably still work)
Twitch for example will allow you take over usernames of accounts that are unused. Also having a good name is less important than you think. Most people don't navigate by going to exact identifiers. They just type the name of the thing into a search and relevant results will be returned. Dead or useless results should not rank high.
...and to host associated services to resolve this name to an IP address, as well as administrative overhead
I'd rather not that my domain name is funded by ads and sponsorships, the way that "X, YouTube, Facebook, Reddit, Twitch, etc all" are (no love for open source or decentralised platforms btw? The more commercial the better, except when it costs you money?)
They’re pretty expensive, and the nature of the service means that if they disappear, they have ownership of your domain and you have no recourse to get it back.
That's the nature of 'private' domain registration used more commonly, at least to some degree for many private registrations. If you read the agreement, you are transferring your domain registration to the privacy service, and they forward stuff to you. I don't know what happens if they disappear, however.
Worse: if Njalla decides you shouldn't have a domain - for any reason whatsoever, including "we don't like your web site" - they can seize it, and you have no legal recourse.
You mean the "domains" that >99% of users can't even resolve, which can't be used to send or receive email, and which you can't have SSL certificates issued for? Don't be daft.
it's still unsupported by a lot of TLDs and the rate limits are atrocious. some registrars only allow 10 requests per day and will group huge netblocks into one single block.
I haven't had a successful use of whois in probably over a decade. What was once a useful tool was destroyed by spammers harvesting email addresses and by privacy-oriented registrars.
I’m serious! I don’t know why we’re turning a fundamental command off, even if it didn’t work correctly for everything. Do you realize how much documentation and how many tools reference it? And it still can work.
This was announced originally early last year. It removes the requirement for TLD and nTLD (not ccTLD) operators to have a WHOIS service available, but doesn't mandate they must shut them down.
So far the sunsetting has had little effect with most TLDs still having their WHOIS services online. In reality, I think we'll see a period of time where many TLDs and nTLDs have both WHOIS and RDAP available.
Additionally, since ccTLDs aren't governed by ICANN, many don't even have an RDAP service available. As such, there's going to be a mix of RDAP and WHOIS in use across the entire internet for some time to come.
Disclosure: I run https://viewdns.info/ and have spent many an hour dealing with both WHOIS and RDAP parsing to make sure that our service returns consistent data (via our web interface and API) regardless of the protocol in use.
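For a sense of what that entails, here's a minimal sketch (hypothetical field names, nothing to do with viewdns.info's actual code) of normalizing both protocols onto one record: WHOIS is loose "Key: Value" text whose key names vary per registry, while RDAP (RFC 9083) is JSON with well-known structure.

```python
import re

# Hypothetical normalizer: map a raw WHOIS reply and an RDAP JSON document
# onto the same minimal record, so callers never see which protocol answered.
def from_whois(text):
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+):\s*(.+)", line)
        if m:
            fields[m.group(1).strip().lower()] = m.group(2).strip()
    # Key names vary per registry, so try the common spellings.
    return {
        "registrar": fields.get("registrar"),
        "created": fields.get("creation date") or fields.get("created"),
    }

def from_rdap(doc):
    # RDAP events carry ISO 8601 dates under well-known action names, and
    # registrar details live in a jCard on an entity with the registrar role.
    events = {e["eventAction"]: e["eventDate"] for e in doc.get("events", [])}
    registrar = None
    for entity in doc.get("entities", []):
        if "registrar" in entity.get("roles", []):
            for prop in entity.get("vcardArray", ["vcard", []])[1]:
                if prop[0] == "fn":
                    registrar = prop[3]
    return {"registrar": registrar, "created": events.get("registration")}
```

Both paths produce the same dict for equivalent data; the real pain is that the WHOIS side needs per-registry key tables, while the RDAP side is one code path.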
I think RDAP is going to be adopted by more and more ccTLDs as well. WHOIS is not a particularly well liked protocol (I was at an IETF meeting where ICANN did a presentation on the timeline and people were literally cheering for the demise of WHOIS).
Disclosure: Work in the ccTLD space.
100% agree that there will be more ccTLD operators that will implement RDAP. The sooner we're on a consistent protocol the better!
Self-plug: I run a little mastodon/activity pub bot that monitors DNS RDAP adoption according to the official bootstrap file: https://social.haukeluebbers.de/@stateofrdap
Last post from yesterday:
> As of today 82.25% (1187) of all 1443 Top Level Domains have an authoritative RDAP service declared.
> These TLDs were added:
> .ye
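For anyone who wants to reproduce the number: IANA publishes the bootstrap registry as JSON at https://data.iana.org/rdap/dns.json, where each "services" entry pairs a list of TLDs with their RDAP base URLs. A rough sketch of the coverage math (the total TLD count comes from IANA's separate TLD list):

```python
def rdap_coverage(bootstrap, total_tlds):
    # Every TLD mentioned in any services entry has a declared RDAP server.
    covered = {tld for tlds, urls in bootstrap["services"] for tld in tlds}
    return len(covered), round(100 * len(covered) / total_tlds, 2)
```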
It's funny to see that a lot of services are finally moving from a human-readable / plain text format towards structured protocols right at the point where we can finally have LLMs parse the unstructured protocols :-)
Well, you can't really trust an LLM to give you reproducible output every time; you can't even trust it to be faithful to the input data. So it's nice to have a standard format now, and one that takes something like a millionth of the computing resources to parse. Also, WHOIS was barely human-readable, with the fields all over the place, missing, or different from one registry to the next. A welcome change that should really have come sooner.
We can't ever have LLMs reliably parse any form of data. You know what can parse it perfectly, though? A parser. Which works perfectly, and consistently.
> Which works perfectly
... on conformant inputs, when it has no bugs.
On non-conformant inputs, a parser will barf and yell at you, which is exactly what you want.
On non-conformant inputs, there's absolutely no telling what an LLM will do, which is precisely the problem. It might barf, or it might blissfully continue, and even if the input was right you couldn't remotely trust it to regurgitate the input verbatim.
As for bugs, it is at least theoretically possible to write a parser with no bugs, whereas an LLM is fundamentally probabilistic.
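To make the "barf and yell" behavior concrete, a toy strict parser: it either returns exactly what the input said or fails loudly at the offending line, and never produces a plausible-looking guess.

```python
def parse_record(text):
    # Accept only strict "Key: Value" lines; anything else is an error,
    # reported with its line number instead of being silently skipped.
    record = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        key, sep, value = line.partition(": ")
        if not sep or not key or not value:
            raise ValueError(f"line {lineno}: not 'Key: Value': {line!r}")
        record[key] = value
    return record
```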
Of course we can. Reliability is a spectrum, not a binary state. You can push it up however high you like, and stop somewhere between "we don't care about error rate this low" and "error rate is so low it's unlikely to show in practice".
It's not like this is a new concept. There are plenty of algorithms we've been using for decades that are only statistically correct. A perfect example of this is efficient primality testing, which is probabilistic in nature[0], but you can easily make the probability of error as small as "unlikely to happen before heat death of the universe".
--
[0] - https://en.wikipedia.org/wiki/Primality_test#Probabilistic_t...
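For concreteness, a sketch of the Miller-Rabin test referenced in [0]: each independent random witness round cuts the worst-case chance of a composite slipping through by a factor of at least 4, so k rounds bound the error by 4^-k. That per-round bound is the tunable dial being described.

```python
import random

def is_probable_prime(n, k=40):
    # Trial-divide by a few small primes first, then run k random
    # Miller-Rabin witness rounds; error probability is at most 4**-k.
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(k):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```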
There are two problems with this comparison. First, probabilistic prime generation has a mathematically proven lower bound that improves with iteration. There is no comparably robust tuning parameter with an LLM. You can use a different model, you can use a bigger variant of the same model, etc., but these all have empirically determined and contextually sensitive reliability levels that are not otherwise tunable. Second, the prime generation function will always give you an integer, and never an apple, or a bicycle, or a phantasm. LLMs regurgitate and hallucinate, which means that a simple error rate is not the only metric that matters. One must also consider how egregiously wrong and even nonsensical the errors can be.
I think the better statement is that, if, say, you're running the Miller-Rabin test 10 times, you can be confident that an error in one test is uncorrelated with an error in the next test, so it's easy to dial up the accuracy as close to 1 as desired. Whereas with an LLM, correlated errors seem much more likely; if it failed three times parsing the same piece of data, I would have no confidence that the 4th-10th times would have the same accuracy rate as on a fresh piece of data. LLMs seem much more like the Fermat primality test, except that their "Carmichael numbers" are a lot more common.
The general point is not that the feature currently exists to dial down the LLM parse error rate; it’s that the abstract argument “we can’t use LLMs because they aren't perfect” isn’t a realistic argument in the first place. You’re probably reading this on hardware that _probably_ shows you the correct text almost all of the time, but isn’t guaranteed to.
Precisely this. People dismiss utility of LLMs because they don't give 100% reliability, without considering the basic facts that:
- LLMs != ChatGPT interface, they don't need to be run in isolation, nor do they need to do everything end-to-end.
- There are no 100% reliable systems, neither technological nor social. Voltages fluctuate, radiation flips bits, humans confabulate just as much as if not worse than LLMs, etc.
- We create reliability from unreliable systems.
LLMs aren't some magic unreliability pixie dust that makes everything they touch beyond repair. They're just another system with bounded reliability, and can be worked into larger systems just like anything else, and total reliability can be improved through this.
There's no such thing as a perfectly-watertight roof, therefore there's no qualitative difference between fixing the roof and buying a bigger bucket.
If your job is to be a reference, to have authority, you absolutely don't want to make any errors. Pretty safe isn't enough; you need to be absolutely sure that you control the output.
You only have one job; don't delegate authority.
But isn't using LLM for that really expensive? Seems wasteful.
I wouldn't use LLMs, but if I did, I would try to get the LLM to write parser code instead.
If it can convert from one format to another, then it can generate test cases for the parser. Then hopefully it can use those to iterate on parser code until it passes the tests.
In a sense, asking it to automate the work isn't as straightforward as asking it to do the work. But if the approach does pan out, it might be easier overall since it's probably easier to deploy generated code to production (than deploying LLMs).
deepseek API costs are quite literally pennies per million tokens
My desktop GPU can run small models at 185 tokens a second. Larger models with speculative decoding: 50t/s. With a small, finetuned model as the draft model, no, this won't take much power at all to run inference.
Training, sure, but that's buy once cry once.
Whether this means it's a good idea, I don't think so, but the energy usage for parsing isn't why.
A simple text parser would probably be 10,000,000 times as fast, so the statement that this won't take much power at all is a bit of an overstatement.
50 tokens per second. Compared to a quick and dirty parser written in python or even a regex? That's going to be many many orders of magnitude slower+costlier.
awk would run millions times faster, not to mention mawk and awka.
In order to make the point that
> energy usage for parsing isn't why
You'll need to provide actual figures and benchmark these against an actual parser.
I've written parsers for larger-scale server stuff. And while I too don't have these benchmarks available, I'll dare to wager quite a lot that a dedicated parser for almost anything will outperform an LLM by magnitudes. I won't be surprised if a parser written in Rust uses upwards of 10k times less energy than the most efficient LLM setup today. Hell, even a sed/awk/bash monstrosity probably outperforms such an LLM hundreds of times, energy-wise.
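A rough single-core data point in that direction (contrived sample text; absolute numbers vary by machine, the point is the gap versus ~50 tokens/s):

```python
import re
import timeit

# A contrived whois-ish reply and one compiled regex to pull a field out.
WHOIS_SAMPLE = ("Domain Name: EXAMPLE.COM\nRegistrar: Example Inc\n"
                "Creation Date: 2001-01-01T00:00:00Z\n") * 20
FIELD = re.compile(r"^Registrar:\s*(.+)$", re.MULTILINE)

runs = 10_000
per_call = timeit.timeit(lambda: FIELD.search(WHOIS_SAMPLE), number=runs) / runs
print(f"~{1 / per_call:,.0f} field extractions/second on one core")
```

Even this unoptimized Python version lands in the hundreds of thousands of extractions per second; a compiled parser would be faster still.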
How many times would you need to parse to get an energy saving from using an LLM to parse, vs. using an LLM to write a parser and then using the parser to parse?
It sounds like you need to learn how to program without using an LLM, but even if you used one to write a parser, and it took you 100 requests to do so, you would very quickly get the desired energy savings.
This is the kind of thinking that leads to modern software being slower than software from 30 years ago, even though it is running on hardware that's hundreds of times faster.
People not using The AWK Programming Language as a reference for parsing stuff, and maybe The C Programming Language with AWKA (an AWK-to-C translator) plus a simple CSP library for threading, yields a disaster for computing.
LLMs are not the solution; they are the source of big troubles.
> using an llm to write a parser
You're assuming OP needs an LLM to write a parser; since they mention writing many during their career, they probably don't need it ;)
I was thinking more of when a sufficiently advanced device would be able to “decide” the task would be worth using its own capabilities to write some code to tackle the problem rather than brute force.
For small problems it’s not worthwhile, for large problems it is.
It’s similar to choosing to manually do something vs automate it.
I didn't use an LLM back then. But would totally do that today (copilot).
Especially since the parser(s) I wrote were rather straightforward finite state machines with stream handling in front, parallel/async tooling around it, and at the core business logic (domain).
Streaming, job/thread/mutex management, FSM are all solved and clear. And I'm convinced an LLM like copilot is very good at writing code for things that have been solved.
The LLM, however, would get very much in the way in the domain/business layer. Because it hasn't got the statistical body of examples to handle our case.
(Parsers I wrote were, among others: IBAN, GPS trails, user-defined calculations (simple math formulas), and a DSL to describe hierarchies. I wrote them in Ruby, PHP, Rust and Perl.)
It’s not just about the energy usage, but also purchase cost of the GPUs and opportunity cost of not using those GPUs for something more valuable (after you have bought them). Especially if you’re doing this at large scale and not just on a single desktop machine.
Of course you were already saying it’s not a good idea, but I think the above definitely plays a role at scale as well.
You’re right, I could be trying to get Crysis to run at 120 fps.
If you have spare GPU time you could donate it to projects like Folding@Home.
My Atom N270 netbook with mawk and a few lines parsing the files with a simple regex will crush your GPU+LLM on both time and power usage.
My assumption is that models are getting cheaper, fast. So you can build now with OpenAI/Anthropic/etc and swap it out for a local or hosted model in a year.
This doesn't work for all use cases but data extraction is pretty safe. Treat it like a database query -- a slow but high availability and relatively cheap call.
While it will become cheaper, it will never be as fast / efficient as 'just' parsing the data the old-fashioned way.
It feels like using AI to do computing things instead of writing code is just like when we moved to relatively inefficient web technology for front-ends, where we needed beefier systems to get the same performance as we used to have, or when cloud computing became a thing and efficiency / speed became a factor of credit card limit instead of code efficiency.
Call me a luddite but I think as software developers we should do better, reduce waste, embrace mechanical sympathy, etc. Using AI to generate some code is fine - it's just the next step in code generators that I've been using throughout all my career IMO. But using AI to do tasks that can also be done 1000x more efficiently, like parsing / processing data, is going in the wrong direction.
I know this particular problem space well. AI is a reasonable solution. WHOIS records are intentionally made to be human-readable and not machine-parseable without huge effort, because so many people were scraping them. So the same registrar may return records in a huge range of text formats. You can write code to handle them all if you really want to, but if you are not doing it en masse, AI is probably going to be a cheaper solution.
Example: https://github.com/weppos/whois is a very solid library for whois parsing but cannot handle all servers, as they say themselves. That has fifteen-plus years of work on it.
But.. that’s exactly what this thread is about. RDAP is the future, not WHOIS.
Yes, exactly. Read what I was responding to.
I think you’re both right, and also both are missing the point.
Using LLMs to parse whois data is okay in the meantime (preferably as a last resort!), but structuring the data properly in the first place (i.e. RDAP) is the better solution in the long run.
Requesting that people think before transferring mission critical code into the hands of LLMs is not being a Luddite lol.
Can you imagine how many ridiculous errors we would have if LLMs structured data into protobufs. Or if they compiled software.
It's more than 1000x more wasteful resource-wise, too. The LLM swiss army knife is the Balenciaga all-leather garbage bag option for the vast majority of use cases.
Still, I wouldn't use an LLM for what's essentially a database query: by their very nature, LLMs will give you the right answer most of the time, but will sometimes return wrong information. Better to stay on a deterministic DB query in this case.
As usual, arguments for LLMs are based on rosy assumptions about future trajectory. How about we talk about data extraction at that point in the future when models are already cheap enough. And in the meantime just assume the future is uncertain, as it obviously is.
https://deviq.com/antipatterns/shiny-toy
Off topic: thank you for running viewdns.info. I don't use it regularly, mainly for the occasional WHOIS information lookup, and it has always worked perfectly.
Thanks for the kind words and glad it's been useful :).
Hey, I've been looking for a tool that can do a reverse NS lookup for a nameserver pair (i.e. which domains have nameservers ns1.example.com and ns2.example.com), but all the services out there that I've found can only do one. Is this something you would consider implementing?
It's kind of funny that some operators have never had it in practice. For example, .es never had a public whois, and you need to register with a national ID (and I think with a fixed IP address) to get access to it.
That need for a national ID hasn't been in place for a long time, AFAIK.
I have a .es (my nickname berkes, domain berk.es) for almost 16 years now, and live in the EU, but not in Spain. In the beginning I used a small company that offered services for non-spanish companies to register .es through them (I believe they technically owned the domains?). But today it's just in my local domain registrar without need for an ID.
That .es has no whois has struck me as somewhat of a benefit, actually. Back in the day, it kept away a lot of spam from spammers that'd just lift email addresses off the whois. My .com, .nl and other domains receive(d) significantly more such spam. Let alone phone numbers and other personal details delivered over an efficient, decentralized network. Though recent privacy addons(?) have mitigated that a little.
Usually, the need to use an ID is only for private persons (and usually only if they are nationals). Anyone else should not need that. The general theory is that a nation can only verify data that they themselves have.
Some ccTLDs have rules against registrations by people not located within the country that owns the ccTLD, in which case a valid national ID or organization number would be required. From what I can see, .es does not have that requirement.
> For example, .es never had a public whois, and need to register with a national ID (and I think with a fixed IP address) to get access to it.
Is this new? I had an .es domain around 2011, and am not Spanish, or even European.
You don't need WHOIS to register a domain.
The concept of WHOIS has felt sleazy for many years.
If I register a domain, the registrar will basically extort a couple extra dollars per year out of me for “domain privacy”, for the privilege of not having my name, home address, phone number, and email publicly available and then mirrored across thousands of shady scraped-content sites in perpetuity. Even if you don’t care about that, then begin the never-ending emails, texts, and calls from sleazy outfits who want to sell you related domains, do SEO for you, revamp your site, schedule a call, or just fill your spam box up with legitimate scams and bootleg pharma trash.
All because you wanted a $10/year dot com without paying the bribe.
And yes I grew up leafing through well worn phone books next to corded phones. This is not comparable.
This is about sunsetting the WHOIS protocol in favor of RDAP, not doing away with domain owner registration data.
It's crazy how many people just read the headline and choose to comment or upvote these links.
Also, why is the title not the same as the article's? It makes no sense.
To be fair, OP never said this was necessarily related directly to the article.
I’ll often post loosely related tangents like this because I would enjoy discussing the tangent with the HN crowd, but there’s often not a better opportunity to discuss it, so why not while we’re sort of on the topic anyway.
Ack that I don’t think it makes sense to discuss not even remotely related topics. But as long as it’s in the ballpark and it’s not going against other guidelines and leads to interesting discussion, I think it’s fine.
Indeed. Furthermore, the fact that there is still a replacement makes the discussion even more pertinent in this case, since OP is arguing for the abolition of any such protocol.
The site tweaks some words out of titles
I can’t downvote. Not sure about others.
From the link:
> RDAP offers several advantages over WHOIS including [...] the ability to provide differentiated access to registration data.
In other words, it provides the ability to monetize and extract more money from people. Like we need more of that...
You can clearly read that from a few miles away. It is that obvious.
Somehow "enshittification" fits in all over the place.
Tangentially - RDAP was created partially to resolve issues with PII in WHOIS
That was a common racket a long time ago, but pretty much every widely recommended registrar offers free whois privacy now. At least when they're allowed to, some TLDs forbid obfuscating the whois information.
For example, *.us domain registrars aren't allowed to privacy protect your domain: https://www.reddit.com/r/webdev/comments/101qjbq/wow_never_b...
A little less than a year ago, my wife registered a .us domain that she ended up not using at all. She still gets phone calls nearly daily from people trying to sell her web design/dev work.
Same with registry.in in India (for .in domains), where WHOIS privacy is not allowed as per the terms and conditions. [1]
[1]: https://www.registry.in/system/files/Terms_and_Conditions_fo...
That’s interesting! Porkbun happily redacts my data for notpushk.in.
I've wondered about this for a while now...
I have two .in domains with namecheap and whois data is all "REDACTED FOR PRIVACY" despite namecheap not allowing me to add domain privacy when I purchased the domains.
In fact Namecheap explicitly state that they can't provide privacy services for .in domains on this page: https://www.namecheap.com/security/what-is-domain-privacy-de...
I’ve looked into it a bit more, and turns out there are two options for redacting WHOIS data:
- “Privacy service”, which is these funky named LLCs replacing your data in the WHOIS
- Just the redaction, which replaces almost all data with REDACTED FOR PRIVACY (except for registrant's country, state, and organization name).
No idea why or how any of this works! Apparently, Porkbun does both: on another domain of mine, aedge.dev, it shows REDACTED FOR PRIVACY and replaces the org name with “Private by Design, LLC”. For notpushk.in, it does show my country (RU... looks like I haven’t updated my address in a while lol) but everything else is redacted, too.
Spaceship on the other hand doesn’t bother and returns only this tiny response:
Edit: or, rather, that’s what whois.nic.google returns for a domain registered in Spaceship.
Porkbun docs on WHOIS privacy options: https://kb.porkbun.com/article/97-new-whois-privacy-settings...
That is the kind of fact that if you talk about it online shortly gets 'fixed'.
If I had a dime for every comment I’ve deleted before posting or decided not to even write on the back of “better not shit where I eat”.
Wow! These policies are like 30 years behind. Exposing your phone number and address on WHOIS makes absolutely no sense in this day and age!
According to German law, every website owned and operated by a person or entity in Germany needs an imprint with the full name, address, email address and phone number of the owner / owning entity…
a) This is only for commercial websites although what counts as commercial is vague and probably not something you want to argue in court so it's safer to just add it unless you are absolutely sure.
b) You need a valid postal address where you can receive mail but this doesn't have to be your home address. A PO box is fine.
c) You don't need to have a phone number in your Imprint.
The base requirement of commercial operations having to have valid contact information (that can be used for legal communication) is pretty sensible. The details could be a bit friendlier towards individuals running purely personal sites.
So this in practice is a massive push to centralization: if you have a Facebook page or Instagram account, you don't need to risk that level of privacy compromise.
Nope, Facebook or Instagram pages used commercially are also required to have an imprint.
A freelancer's sites are also considered commercial use.
And such sites without imprint have been fined & taken down.
If you engage in commerce, you need to publish enough contact information that others could serve you a court summons.
At the same time, expecting that your NAP info isn't already in the hands of anyone who wants it makes no sense in this day and age.
Between the countless DB leaks and numerous infostealer campaigns, and considering that anyone who has you in their contacts list is extending the exposed surface area, it's untenable. Other events like marriage and home ownership further complicate any attempt to keep your name and address private.
Not saying you shouldn't opt for domain privacy, just giving a reality check. To really enforce your privacy you have to have multiple phone lines and a shell company, at the least. And really, even that isn't enough unless you can also commit to being a hermit.
There is a tangible difference between some people having this data somewhere out there, and literally anyone who wants to have it being able to look it up in a few seconds using tools already installed on almost every computer anywhere.
The ability to look up the correct contact details for a commercial enterprise on that enterprise's website is a good thing imo. It is (or was) part of the EU requirements for commercial websites (anything selling, giving purchase advice, advertising, ...).
It's a useful filter, a seller without identifiable people and location is a big red flag.
I commit to being a hermit.
Exactly. All their info was scraped long ago. WHOIS and abuse info all needed to be deprecated a few decades ago. But pity the poor fool who actually contacts me. I treat them like regular scammers: get all the info, and then tell them to pound dirt.
Except for the guy who tried to sell me annuity liquidation. Yes, if the person gets unalived earlier than expected, you win.
In related news, I saw someone buy $150 worth of lottery tickets, as I was on the way to a large hospital to visit a sick friend. The lottery guy I am sure lost, and the hospital guy (profit-care) won, while the ward was understaffed( a profit-center). And 7 out of 8 fare collection machines were out of order ( deferred maintenance as a profit-center). I get the distinct feeling that corporate America, just does not even care in the slightest.
For the organization that managed the WhoIs? The horse left the barn so long ago, it's great great great grand-children are old and gone. Long gone.
Call me 1-800-555-1212.
you just have to have enough money to have some legal entity register on your behalf and that legal entity then has their system spammed, but they have their phone public anyhow...
the idea is to have individuals accountable while not annoying owners.
in that sense it makes _perfect_ sense and works as intended.
a proper solution ingredient would be trustworthy and affordable pseudonymity, and that can be lifted by court orders only. but then who guarantees the independence of courts? and the fairness of laws?
we're in a tough ride.
So .us is more trustworthy than .com. Good to know.
I'm one of those who think that developers are hiding too much, which makes things like VS Code extension viruses rampant.
I won't force you to not be anonymous, but if you are going to run your software on my device I want some accountability. Our salaries should also reflect that.
I'm sure that this will be unpopular, though.
>So .us is more trustworthy than .com.
How do you come to that conclusion?
>vs code extension viruses rampant.
So far I haven't encountered a single actual virus, and if you're referring to the recent Material Theme debacle, there was never any malicious code involved, only third party libraries with obfuscation.
I think I understand your point, but your wording leaves some ambiguity. If I am running my software on your device you must be a cloud provider. In that case, the accountability you are looking for is probably not provided in the same way it would be if you were running my software on your device.
Either way, your aversion to anonymity of developers is interesting. It's a discussion for a different thread, but I think an important one.
It would be nice to find such a thread. This is a pet peeve of mine.
It’s one thing if you have a PO Box, and it’s consistently used in your various documents and registrations. I get wanting a firewall to direct availability.
But if I can barely find evidence you exist other than your software, or if you operate a fairly large scale service and you haven’t filed a yearly required corporate report (a specific example I recently came across), then those are red flags to me. Not immediate showstoppers necessarily, but if you’re trying to get me to make a purchase, I probably won’t.
It’s fine if you have domain privacy turned on, but you’re selling me software or services you have got to offer some kind of evidence that you have some kind of business nexus someplace. In a business context, I’ve got to know that for avoiding sanctions violations at the least.
>Either way, your aversion to anonymity of developers is interesting
My personal take is that we need a society with a lot more trust.
I agree, although is the domain system really the best way to do that?
I mean people with names and faces will more than happily sell you out
"E-ZPass Outstanding Toll Notification
Dear User,Our system has identified an unpaid toll charge linked to your vehicle. To avoid additional fees or service disruptions, please settle this matter within 12 hours.
https://e-zpass.org-qrh.xin/indexshtml"
Best of luck trying to get an unknown Chinese registrar to stop their spam. My carrier does not even have a clue. My routers now block anything *.Xin. Anything and everything.
Apparently, Xin has not learned about hiding info: bj#xinnet.com (change the # to an @). Somehow someone lists it as "Elegant Leader Limited".
> but pretty much every widely recommended registrar offers free whois privacy now
If you go by the book, e.g. with Cloudflare, not every field is hidden (e.g. state and country). So not exactly.
> The concept of WHOIS has felt sleazy for many years.
More recently, yes. But the original (perhaps naive) goal was to keep domain owners accountable for whatever they were serving from hosts under their domains. That seems reasonable, at least on a more "polite" internet, where things weren't scraped and monetized and SEO'd into garbage.
The general purpose of publicly accessible registrant data is that people should be able to contact the owner of the domain in case of an issue, rather than the registry or registrar. "domain privacy" is simply the registrar putting themselves as the domain contact and becoming a forwarding service to you.
For large companies, and for registrants under those ccTLDs that require local presence, it's not uncommon that a legal firm acts as a proxy for the domain owner. This is a service that they take a few dollars for, and it is in many ways similar to domain privacy.
The requirement of having the registrant as the contact person for a domain is something that (to my knowledge) comes from ICANN, and I think it has a positive effect. A domain should be owned and controlled by the registrant and not the registrar, which is then reflected in the contact information. In an alternate history we could see that the registrar (or even registry) owned the domain and only leased it to the registrant, in which case the registrant's power would be limited to other online services that people "buy" today.
You’re just using bad registrars.
https://porkbun.com/products/whois_privacy
Porkbun only came out in 2014
Two decades late on a problem
Oh the good ol days. $10/m for slow PHP shared hosting and $150 for an SSL certificate too.
Web hosts competing based on who had the prettiest cPanel theme. The number of email accounts you were allowed was something that mattered. If you were lucky enough to get SSH access, it was jailed and only really allowed you to move files around more easily or edit something with vim/nano.
Oh, I have unintentionally become a GoDaddy customer (a company I have spent ample time hating and shitting on over the years) because I was a legacy Media Temple customer going back to like 2006 and I still just can't be bothered to clear out everything on those sites/domains and they eventually got acquired
Let's Encrypt has done great work with certs for free. But they do still cost money. Insane how long unencrypted traffic was the default. But I could not have done anything if browsers had soft-enforced HTTPS earlier; I simply could not have paid that money.
You and everyone else: unencrypted stopped being the default as a pretty direct consequence of increased accessibility of TLS certificates.
You could get free SSL certs before LE. What LE changed was making it possible to fully automate the process.
Yeah, in the late 90s telnet to the server was the default. So all those delicious CLI sessions were just flowing through the Ethernet traffic in plain text.
How do they still cost money?
Or you had to get an ISDN line just to get a static IP for your clients to FTP the files.
I still can't get my head around why a .com costs $9.59 (plus registrar margin)
There are 160 million registered .com domain names.
I understand that operating root servers isn't free, but surely they don't cost $1.5 billion per year! Wikipedia's hosting costs are $3 million per year, for comparison.
Only $0.18 goes to ICANN, the non-profit. The rest goes to Verisign, a publicly traded for-profit company, which ultimately gets that $9.59. I bring this up because it of course _doesn't_ cost that much. Incidentally, Verisign posted $1.56 billion in revenue last year and spent about $1.21 billion on stock buybacks in the same period.
As I understand it, Verisign doesn't own the .com TLD, they are just a contracted service provider to ICANN.
Which raises the question: why doesn't ICANN just replace Verisign with a different authoritative registry that charges much less?
Because that doesn’t solve the problem. The demand doesn’t go away if you charge less – if you charge $1/yr for .COMs, they will all be permanently squatted. (Well, like now, but worse!)
We could use anti-scalping techniques, but that’s non-trivial to implement. Perhaps some name squatting policy? No idea how to enforce it though, especially without money.
Fair enough, but even if we use a floor price to disincentivise squatting, I'm not sure why we should gift those excess margins to a private company.
Shouldn't ICANN collect that margin and use it for charitable purposes instead?
Yeah, that’s a good point. Then again, you could ask that about any other gTLD (why should Google get the proceeds from .dev?), and it would be a valid question.
I think the current system is inherently flawed... but it kinda works, and nobody wants to figure out the politics of fixing it – so I guess we’re stuck with it for a while.
That’s a “$1.56 billion” question…
Of those 160 million, what percentage are on 1-year renewal plans, and how many are on multi-year plans? I'm guessing the vast majority are yearly. It would be interesting to know how many never get re-registered after the first year.
Headline number trend is what matters. Yeah lots of failed projects but then lots of new projects to make up for it!
I agree that it's ridiculous, but absent some sort of regulation, things are not priced based on how much they cost the provider, but based on how much people are willing to pay. Even if they're unhappy about it.
The thing is, there are supposed to be regulations. .com is not privately owned but a public good that is supposed to be regulated by ICANN with the interests of the public in mind.
Just in case: you can get a .com for less than that nowadays, sometimes $3 for the first year (then transfer it back and forth for $5–7). Here are some price comparisons: https://tldes.com/com, https://tld-list.com/tld/com
I assume some registrars sell these at a loss and expect to offset that by selling you WordPress Supreme Ultra Enterprise hosting for... $40/yr? No idea how this works.
Because it's a natural monopoly. Nobody ever got taken seriously with a .biz address.
(.com is basically price-regulated because of this, FWIW, Verisign can't just raise prices whenever or however it wants. But obviously it's still a pretty sweet deal for them, I'd imagine.)
Hell, even .net will lose you traffic. If someone has your desired name on .com, forcing you onto any other TLD, you will lose traffic. If your .com is taken by someone in the same line of work, and not just a coincidental use of the same name, you'd be insane not to change the domain. I'm not sure how many people manually type domains any more (I do though), and .com is muscle memory.
Sure, it's a natural monopoly, but it's owned by a non-profit (ICANN), so where is all the money going?
I've never had to pay Namecheap extra for WHOIS protection.
They always list it in the line items and in the renewal but whatever. In fact, it looks like I forgot to turn on auto-renew on their domain privacy product so it's sitting there in the 'grace' period. They work as a registrar so I use it.
It used to be more common back then
Or you find one of the many registrars that offer free private whois, and none of these problems exist.
> The concept of WHOIS has felt sleazy for many years.
The concept of most internet things has felt sleazy for many years. Right around the time that businesses started monetizing the internet is when that feeling really kicked off tbqh
> the registrar will basically extort me a couple extra dollars per year for “domain privacy” for the privilege of not having my name, home address, phone number, and email publicly available
Your registrar is scamming you.
I was going to buy a domain back in my student days, but I stopped when I realised I didn't have a phone number. I used the public phone-box on the corner whenever I needed to actually call anyone. It was a little annoying to have to register a phone number when I didn't actually want anyone to call me.
I don't have the greatest registrar but hiding my info from whois is free
GDPR is what changed this. Before that, registrars had little incentive to hide it for free when they could instead charge you for the service. It was not trivial that Google Domains (rip) came with free privacy proxy right from the beginning.
It's not so much that registrars had little incentive, but rather that GDPR defined the concept of legitimate interest as the standard for when registries should give out public information about domain ownership. That allows the contact information to still point to the correct domain owner without going through a proxy, while still creating a small hoop for parties interested in extracting ownership information from the registry.
One can see this in practice: company registration information is usually still available (though often behind a captcha), while personal information of private registrants requires additional steps to demonstrate a legitimate interest. All of this also generally occurs at the registry level, rather than at the registrar.
It should be mentioned that privacy proxy is very similar to a straw man registration. If the registered owner is the proxy, then you are trusting that the proxy will honor the contract that is linking you with the property.
> GDPR
And yet all German sites must have such thing: https://0pointer.net/imprint
Only commercial websites.
https://allaboutberlin.com/guides/website-compliance-germany...
So I've walked past Lennart Poettering's house before without knowing it. (And that is not the sort of area where I'd have guessed he would live.)
If I were some kind of crazy maniac, I could pay him a visit and shut down systemd for good. You see why having this information out there is dangerous?
Not all sites, personal websites don't require an imprint AFAICS.
They do. Even your Bluesky/Mastodon account does.
Absolutely not, where did you get that idea?
Mastodon _instances_ have Impressumspflicht, sure. But normal users don't, and I have never seen anything to the contrary about private accounts.
Edit: unless the Account is for/by a business of course.
Phone books only went out to the city; the internet is full of every scammer from Bangalore to Bangladesh.
Strangely limited region of focus.
Well, traveling west
Also alliterative.
if you use a sleazy domain registrar, you get what you get. the good ones offer privacy for free.
For .pl TLD, due to GDPR, domain data is hidden by default for private individuals (as opposed to companies), yet some registrars still try to upsell the "domain privacy", hoping you don't know about it.
Note that it is being replaced with a different protocol, is there any indication that there are less stringent requirements on identity data disclosure on the new proto?
It's just a different protocol for how to send the data. It doesn't affect requirements on the data itself.
RDAP replaces WHOIS, offering a more technologically advanced way to discover the domain is protected by privacy services.
Domain whois is useless, but IP whois is at least kind of useful to check before blanket banning entire IP ranges.
Interestingly, when discussing WHOIS with my networking students, I discovered .edu WHOIS is not (cannot be?) hidden. I suppose EDUCAUSE either requires WHOIS to remain open or does not offer information hiding.
Doing some WHOIS lookups, we found a point of contact at a university, called the network admin, said hello, and launched into an impromptu network admin interview. It was cool stuff. I emailed him later in the day to apologize and thank him for being a good sport about the whole thing. He (fortunately) found it all rather enjoyable.
Some other TLDs, like .us and .in, also forbid WHOIS privacy. TLD owners are free to set whatever policy they want around this. Perhaps .edu does the same.
It's useful for checking if a domain name is taken without doing that through a registrar, which is both less convenient, and (in case of shitty registrars) can be sold to domain speculators.
Depends what endpoint you hit; the lookup data will likely be sold regardless.
whois/rdap is very useful to identify if a domain is registered or not, and if so with whom. still lots of use there without pii data.
Both give you a way to find out the domain's registrar, registration date, transfer status, and administrative contacts like abuse@. Nameserver data can also be somewhat useful.
Otherwise, what did you expect the registrar to divulge to you, a random passer-by?
As a random passer-by I can look up the registered ownership of any building on the street.
As an Australian, I can look up the ownership of random properties in the US for free. But if I want to do the same for a building on my own street, I have to pay a US$11 fee per property searched.
The US has a reputation of being a hypercapitalist society, yet they seem to be behind Australia in the descent into hypercapitalism by not (yet) privatising the registration of land titles. [0]
[0] https://www.abc.net.au/news/2017-04-12/$2.6-billion-price-ta...
Considering Australia (SA) invented the concept of the Torrens Title which means that we don't have to pay extra to protect a piece of paper, and that the Titles Office has always charged for access to titles, I don't think that this is the "hypercapitalism" hill to die on.
It also means that banks can't sell mortgages out from under their borrowers, because all liens and other financial liabilities attached to a title are known.
Is that hypercapitalism or .. too much state control?
A private industry would be able to maintain the records for next to nothing by advertising or offering related services.
The govt could restrict themselves to ensuring no monopoly.
Intentionally or not, it also prevents mass scraping.
It doesn’t because you can negotiate a bulk discount. If you want all the titles, they’ll sell that to you - for a huge fee, but still a big discount off paying for them all individually. So essentially it prevents mass scraping by individuals and small businesses, while posing no real obstacle for megacorps with megabudgets
I get the joke, but whois is super valuable for abuse report contact and for registrar and even ip block info!
Huge protocol for cybersecurity
Wow. I never noticed how much the way I use the internet has changed. I haven’t done a WHOIS in a decade.
When I started using the internet, it’s how I contacted people. If I liked their site or their blog, I’d check who was behind it and get an email address I could contact.
Now… humans don’t really own domains anymore. Content is so centralized. I obviously noticed this shift, but I had forgotten how I used to be able to interact with the internet.
And after you emailed them you could finger their address and see when they last checked their email, and their unread message count usually.
I had no idea this was a thing for email... Wow.
Only for Unix accounts.
My only nitpick is that humans still own domains, but I agree with the overall sentiment and thank you for sharing this perspective.
It is fascinating to consider how our experience with the internet is changing over time.
Remember phreaking? Having been born in the Netscape era, I certainly don't, but I can imagine that losing the ability to pull that trick off must have felt like a loss to those who were initiated in the art.
Thankfully the trend appears to be that new technologies and thus new 1337 h4x are still forthcoming.
I think in most ways it's better, it makes the web more approachable to less technical users, making it less gate-keepey, but I also kind of miss the loosely-coupled cluster of web pages from the late-90's and early 2000's web.
Stuff felt less homogeneous; everyone had kind of a loose understanding of HTML, and people would customize their pages in horrendously wonderful ways. It felt more personal.
So many tech people have a fondness for that time. To me, it was a very narrow slice of the human experience. Today I can find sites and communities on any subject I can conceive and billions more that I cannot.
And personally I found it more horrendously ugly than horrendously wonderful. But that's just my opinion.
Yeah, as I said, in most ways things are better now than in the rose-tinted memories of the late 90's and early 2000's. Now if you want to say something on the internet, you can open up a Substack, or a Bluesky, or a Medium, or you can find a niche subreddit. You don't need to know anything very technical, and that's a good thing.
I'll acknowledge that the old web was ugly, even at the time. I guess I just liked how much of it was, for lack of a better word, "custom". Most people were pretty bad at HTML, common web standards really hadn't caught on outside of "make it work in Internet Explorer", and CSS hadn't really taken off, so people glued together websites the best that they could.
Most websites looked pretty bad, but they were genuine. They didn't feel like some corporation built them, they felt like they were made by actual humans, and a lot of the time, actual children. I was one of those children.
I posted about this a week ago [1], but my first foray into programming was making crappy websites. It felt cool to me that a nine year old could make and publish a website, just like the grownups could. I didn't know anything about style so I had bright green backgrounds and used marquee tags and blink tags and I believe I had a midi of the X-files theme song playing in the background.
I guess it's the same sentimentality that I have when I look at a child's terrible drawing or reading one of my old terrible essays I wrote when I was eleven years old that my mom kept around. They're bad, they're embarrassing, but they're also kind of charming.
[1] https://news.ycombinator.com/item?id=43297104
> Yeah, as I said in most way things are better now than they were in the rose-tinted memories of the late 90's and early 2000's. Now if you want to say something on the internet, you can open up a Substack, or a Bluesky, or a Medium, or you can find a niche Subreddit. You don't need to know anything very technical, and that's a good thing.
By 1999 you could create a LiveJournal or find a niche forum through Google. You didn't need to know anything very technical.
You could, Xanga as well, but it was still less connected. People complain about recommendation systems on YouTube and Facebook and Reddit, but one thing that they do well is give people more reach that they probably wouldn't have gotten before.
I've found so many interesting YouTube videos from people that I hadn't ever heard of, just because YouTube recommended them to me. Stuff like that didn't really exist for quite a while; for a long time the best you had was aggregator sites like ThatGuyWithTheGlasses.com or similar.
> I think in most ways it's better, it makes the web more approachable to less technical users
There's a big gap between looking up someone's contact info using a protocol that many tools and websites implement (anyone can open www.who.is from search results) and the second example of needing an understanding of HTML to make a webpage. I don't think it's gatekeepey to be able to email the human behind a given website, whereas the current internet is full of walled gardens, gatekeepers, and faceless/supportless services (thinking of Discord, Cloudflare, and Google as respective examples)
We can have both human-run services and WYSIWYG website builders on the internet concurrently
Less gatekeepey? Big Tech is literally the gatekeeper. Want to see a story without an account? Too bad. Want to see what events are going on without a Facebook account? Too bad. Want to search Discord or Twitter? Too bad. Big Tech sucks in all user content and then hides it behind paywalls.
This was exactly my reaction.
I think a lot of people fail to appreciate that the alternative to big tech taking over was not keeping things exactly the same as they were 20 or 30 years ago, but developing in a different direction.
It was the direction in which people expected things to develop: decentralised and democratised. There was a lot of optimism about empowering individuals.
I sometimes use whois multiple times in a day lol.
Should it exist? Maybe not, probably not, but that doesn't stop me from using it when I want to try to do some sleuthing. Most of the time though it doesn't work because they have privacy enabled.
I did get screwed once with certain TLDs not being able to enable privacy. I had registered a .at domain to use with a video site I had that at the time was reasonably popular and going viral fairly regularly. I hadn't realized beforehand that privacy wasn't possible, but once I learned, I didn't love it, but I wasn't sure if it would matter that much. I was wrong. I was getting calls and emails regularly from random people on the internet who found our content on reddit or whatever and decided to do some sleuthing
How do you hold both of those ideas in your head at the same time?
Well, they did say it probably shouldn't exist. Also, I'm just blown away by how much people here don't consider having fake info as an option.
That works great until the TLD decides you need to hop through extended verification and fork over an identity card and a recent (3 months) invoice showing the address you signed up with 12 years ago, freezing your domain such that you can't update the information to be your current address even if you wanted to share that with the world (because privacy doesn't exist and GDPR doesn't apply in French-run/France-headquartered AFNIC). There's no time to dispute it or go back and forth: the initial email already comes with the announcement that your domain will go dark if they haven't processed your response after 14 days. Oh yeah, and you need to submit this via plain text email. If you send a link to the pdf scan, so that you can remove it after they've viewed it, that gets rejected (but it will be downloaded by an overseas system, run in the USA, within seconds of sending it), they'll respond that it specifically needs to be an attachment so that it will linger in their inbox forever
If you use fake info in relation to WHOIS data, you also need to be prepared to forge an identity document (a pretty bad felony in most countries per my understanding)
That said, on most forms I enter fake info because they have no legitimate use for it anyway and they can't compare it against anything. Buying a game or event ticket needs my address? For what, linking my purchase to a profile they're building? Nah, fake address it is.
I use it primarily to lookup info on an IP address.
I did a Whois last week to prove to my previous registrar that I'm no longer with them, and that the invoice they sent was invalid. Unexpected use-case, but useful.
On the other hand, I did a WHOIS days ago to check up on a potential scam site my partner landed on while working on an e-commerce platform. I hope some alternative exists; people using Let's Encrypt leave an entry in the certificate transparency log, but they don't necessarily need to use it. I haven't researched the alternatives to WHOIS yet, but now I'll have to.
did you find anything useful?
> Now… humans don’t really own domains anymore.
Even when they do, it's generally a smart idea to anonymize the whois information.
You might be looking up my domain to make a buddy, but someone else might be looking up my domain to SWAT me.
Although shit did happen back in the day. Someone showed up at the house of the DeviantArt CEO in like... I wanna say 2007? and slashed his tires etc. WHOIS was only cool in the 90s.
A big part of that is because GDPR basically murdered Whois. It hasn't been useful for many of those last ten years.
The article is titled:
> ICANN Update: Launching RDAP; Sunsetting WHOIS
Bit deceptive to editorialize it into something that sounds like something else much more interesting (removing contact info from domains) but isn't the case at all (they're just changing the method to access the same info).
Worth mentioning are two open-source RDAP projects that are helping move the internet to a more structured system:
DNSBelgium: https://github.com/DNSBelgium/rdap
RedDog: https://www.reddog.mx/home/2017/12/14/server-1.2.2-patch-rel...
https://github.com/openrdap/rdap
Golang, single binary, cross platform, download and use.
Just noticed that someone is going around downvoting any mention of any implementation of RDAP clients for this news item. Very strange.
Whois it ;)
I’m assuming this is a client app, and not a server implementation.
Yes, it's an RDAP client, command line.
I like WHOIS with its extreme simplicity [0]. RDAP, on the other hand, works on top of a large and changing HTTP [1], and uses a JS-derived serialization format [2]. RDAP has advantages, such as optionally benefiting from TLS, the data being better structured and defined, but the cost in added complexity seems high.
[0] https://datatracker.ietf.org/doc/html/rfc3912
[1] https://datatracker.ietf.org/doc/html/rfc9082
[2] https://datatracker.ietf.org/doc/html/rfc9083
It's a bit unreasonable, IMO, to criticize the fact that RDAP communicates using a JSON API -- while JSON is inextricably linked to JavaScript (and it's not without its issues), it's ubiquitous on the modern web for serializing data, in any even vaguely REST-shaped API.
You could argue that a more compact, binary, wire format is more appropriate (though I wouldn't, in this case, since for small, simple payloads, I think simplicity and human readability trumps sheer wire efficiency). You could argue that JSON's a poor serialization language in general (which is debatable, contextual, and in this case, I don't think there's a widely-accepted better option).
But let's not act like "a JS-derived serialization format" is some kind of mark of the beast here.
As far as I can see, an RDAP request is a simple HTTP request, looking like http://example.com/rdap/ip/192.0.2.0. Web servers still support HTTP/1.1 (or probably even HTTP/1.0 and HTTP/0.9). This is trivial for clients to implement; a simple HTTP request like that is about the simplest thing there is. You'll have to use curl or wget instead of netcat if you want to do it manually. No big deal.
"A JS-derived serialization format" ... You mean JSON, which is about the lowest common denominator in Internet data exchange these days (and has been ever since we found out that XML was overly complex and JSON was much easier to use). You'll have to use something like jq instead of grep to extract information from the data manually. Or rather, you'll be able to use the powers of jq. Again, I don't really see the problem here.
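To make that concrete: an RDAP lookup really is just an HTTP GET that returns JSON. A minimal sketch in Python, stdlib only -- the rdap.org redirector base URL is an assumption (any RDAP base serving the TLD would work, e.g. Verisign's for .com), while the events/eventAction/eventDate member names come from RFC 9083:

```python
import json
import urllib.request

# Assumption: the rdap.org redirector, which forwards to the
# authoritative RDAP server for the queried TLD.
RDAP_BASE = "https://rdap.org/domain/"

def fetch_rdap(domain):
    """An RDAP lookup is just an HTTP GET returning a JSON object."""
    with urllib.request.urlopen(RDAP_BASE + domain) as resp:
        return json.load(resp)

def registration_date(rdap_record):
    """Pull the registration event out of a parsed RDAP response
    (member names per RFC 9083)."""
    for event in rdap_record.get("events", []):
        if event.get("eventAction") == "registration":
            return event.get("eventDate")
    return None
```

Compare that with WHOIS, where the same lookup is a raw TCP exchange on port 43 followed by guessing at whatever text format the registry uses.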
I did not mean that there is a problem with it, only that I appreciate the simplicity of WHOIS. While HTTP-with-JSON is perhaps the most practical solution these days.
To clarify my point of view, an ad hoc HTTP client for this indeed should not be hard to write from scratch, demonstrating that there is not much complexity in that. The server part would be a little more tricky; still doable, but not as easily as for WHOIS, and in most cases a more sensible approach would be to use libraries (or a program like curl, in case of shell scripting or manual usage) for that, as you said. Likewise with JSON: though one can deal with it as with text, some added tools (a library or jq, depending on context) would be sensible to use. But then added dependencies lead to all kinds of issues in non-ideal conditions (e.g., when it is problematic to install those). But again, I am not saying that this should stop adoption of RDAP.
On top of that, a complete and proper HTTP 1.1 implementation, server or client, would be quite large. And JSON, while indeed common and not particularly complicated, still has bits I find awkward (no sum types or a standard way to encode those, but has "objects", arbitrary-looking primitive types; no single standard for streaming, either), so working around it is not exactly pleasurable. Those add up to a difference between a trivial protocol and, well, a non-trivial one. I appreciate such trivial yet working and useful solutions, though the other kind is commonly useful as well.
Also, a large number of command line RDAP clients output plain text instead of JSON if you ask nicely.
Most people won't even notice this change. They'll still go to a "whois lookup service" and input a domain, and get the same results. The fact that it arrived via a different protocol (RDAP) won't mean anything.
The linked page (https://lookup.icann.org/en) seems to work only for .com domains?
"No registry RDAP server was identified for this domain. Attempting lookup using WHOIS service."
"Failed to perform lookup using WHOIS service: TLD_NOT_SUPPORTED."
Not just .com, it also works for .org, .app, .dev, etc.
As suggested by another comment, it looks like not all ccTLDs support RDAP. For example, .io does not.
To be replaced with a system providing a standardized method to give law enforcement easier "secure access" to your redacted personal information.
Wait, people use real information?
I do. The terms of the domain registration say that providing incorrect information can result in revocation of the registration. Not really worth the risk, IMO, for any domain I actually care about.
Not just that, but also if the registrar turns out to be fraudulent or someone convinces your registrar to transfer the domain (scam the support team), or they get your account password and transfer the domain that way (data leak elsewhere, password reset with a sim swap, you name it)... there are so many ways you can have "technical difficulties", but in the end: you're the one with an ID card that has your name on it. You can take the TLD to court and have them give you back the domain that was legally purchased in your name
Except if it's not in your name
So yep, as you say: make this decision (fake or real information) knowing the risks involved in not legally owning it
that's grounds for cancellation of a domain sooooo.....
We have ownership records for real estate for a reason. Domains need some level of accountability.
I'm not sure this follows. You're allowed to publish, say, a book or pamphlet without signing it with your legal name and address. So is a website more like a book, or a building?
Somewhere in the middle IMO. If the domain name is desirable it looks more like a building, because people generally care about who owns the land when it is not getting put to good use.
Websites are more like books when they have a domain no one else cares about.
So, maybe require official ID/address/contact info for any domain over a certain price? Or for all domains under a certain character count, maybe, which could vary for TLD.
How do you determine the value of a domain name? Also there's nothing particularly valuable for most short domain name strings except on .com. It's generic words that tend to be valuable, not a short random string.
Domains point to IPs, and IPs already have subpoenable ownership records at RIRs. In the real estate metaphor: we have property ownership records, but we don't have records of every rental tenancy.
That's not true. Those are registration records, NOT ownership records. People do not purchase IP addresses or domains. They register them for temporary use.
For non-legacy allocations, point taken (but my original comment still stands if you replace "ownership" with "registration"). For legacy allocations, it's more complicated.
See: Are IP Address Allocations Property? (2014) https://www.ethanheilman.com/x/19/index.html
This is completely untrue.
https://www.zdnet.com/home-and-office/networking/court-rules...
This article is not inconsistent with my comment. The court rejected a subpoena against the ISP for the identity of the user of the IP, not against the RIR for the identity of the owner of the IP. This is like the court rejecting a subpoena against the landlord for their tenant's identity.
ICANN accredited domain registrars (so any registrar selling generic TLDs like .org, .com, .design etc) have contractual obligations related to technical abuses like phishing, malware, and botnets, insofar as they intersect with a domain name.
Content/expression-related harms are outside of ICANN's bylaws, and any obligations related to what a domain points at come not from ICANN but from the laws of the jurisdiction in which the registrar operates. This is generally good. There is no global standard for acceptable limits on expression, with the possible exception of CSAM, which is illegal everywhere.
Requiring domain registrars to arbitrate what content should be accessible via the DNS is perilous.
No they don't.
"Only law enforcement" is still better than "everyone".
I disagree. Law enforcement already abuses many data sources they have private access to, and use asymmetric information to their advantage.
rdap is nice when it's available.
(or brew install, etc., depending on your OS and tooling). The jq-formatted output is a little more verbose than the whois one, but three cheers for a well-specified machine-parsable format. (And rdap has a pretty-printed output format as well.)
Back in 2014, when the .church TLD was introduced, my friends and I tried to register alonzo.church and (ab)use the contact information records to provide some biographic information and links, making whois alonzo.church on the command line literal. That would not have prevented hosting whatever services on that domain as normal.
Sadly, we were not able to secure the domain in time, and after 11 years the attempted trick is becoming irrelevant.
It doesn't work with yandex.kz. Someone call Kazakhstan.
> No registry RDAP server was identified for this domain. Attempting lookup using WHOIS service.
> Failed to perform lookup using WHOIS service: TLD_NOT_SUPPORTED.
If distribution packages don't abstract this trivia away I'm going to be endlessly frustrated
I don't play with domains all day, but this very much feels like nothing important was accomplished, and things are just being made more complicated for political reasons. Sorry if that is being harsh, but I've never had any issue using WHOIS.
If you've ever tried to parse WHOIS programmatically, you'd realize that it being an unstructured blob of text is actually quite unconducive to it being useful. Having every endpoint return a standardized JSON payload specified in an RFC is much better.
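A rough sketch of the difference in Python -- the WHOIS field names below are only examples (registries differ, which is exactly the point), while the RDAP member names are the ones defined in RFC 9083:

```python
import json
import re

def parse_whois(text):
    """Best-effort key/value scrape of a WHOIS text blob.
    Every registry formats these differently, so this is
    inherently fragile guesswork."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*([^:]+?):\s*(\S.*)$", line)
        if m:
            # keep only the first occurrence of each field
            fields.setdefault(m.group(1).strip().lower(), m.group(2).strip())
    return fields

def parse_rdap(payload):
    """RDAP is just JSON with RFC 9083-defined members: no guessing."""
    data = json.loads(payload)
    return {e["eventAction"]: e["eventDate"] for e in data.get("events", [])}
```

With WHOIS you hope the registry spells it "Creation Date" and not "created", "Registered On", or something in another language; with RDAP the key is "registration", full stop.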
Better for whom?
I've had domains registered for over 30 years. I liked WHOIS because it provided a means to report abuse, which has gone from zero 30 years ago, to massive amounts of daily spam and network probes. I was not happy when ICANN began to allow privacy features in domain registration data, and I never made mine private. Most reputable sites still provide contact information via WHOIS.
Hopefully RDAP will be a suitable replacement. I haven't tried it yet.
RDAP is just a different format for WHOIS data.
> I was not happy when ICANN began to allow privacy features in domain registration data, and I never made mine private
The issue for me is that you can't simply publish contact information. It requires you to either publish the legal owner in full or nothing. I can't publish abuse@example.org as a contact method (because, yes, I do want to receive an email if someone finds an issue with my services); I need to also publish a legal name, an address, sometimes a phone number. Those things cost money to set up to be fake-but-legit (burner SIM card, rent a letterbox somewhere, get someone else to submit their name and ID card), whereas an email address is inconsequential to publish and I can rotate it monthly to avoid it getting onto too many spam lists.
So my sites never provided contact info via WHOIS when I could avoid it, yet I'd think my sites are as reputable as they come. You can always find a plain old email address via some link on the homepage and I have no spam filter (just email address rotation) so there is no chance that you're algorithmically filtered out, either
I can't publish abuse@example.org as contact method
For what it's worth one can publish that email address in their DNS zone SOA record. Some people will figure it out.
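For the curious: the SOA RNAME field encodes a mailbox with the first unescaped dot standing in for the `@`, and `\.` escaping a literal dot (RFC 1035). A rough decoder, with invented example addresses:

```python
def rname_to_email(rname: str) -> str:
    """Convert a DNS SOA RNAME (e.g. "abuse.example.org.") to an email
    address ("abuse@example.org"). The first unescaped dot separates the
    local part from the domain; "\\." escapes a literal dot (RFC 1035)."""
    rname = rname.rstrip(".")
    local = []
    i = 0
    while i < len(rname):
        if rname[i] == "\\" and i + 1 < len(rname):
            local.append(rname[i + 1])   # escaped char, e.g. "\." -> "."
            i += 2
        elif rname[i] == ".":
            return "".join(local) + "@" + rname[i + 1:]
        else:
            local.append(rname[i])
            i += 1
    return "".join(local)  # no unescaped dot: not a valid mailbox encoding

print(rname_to_email("abuse.example.org."))          # abuse@example.org
print(rname_to_email("first\\.last.example.org."))   # first.last@example.org
```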
In the case of YC they defer to the AWS DNS admins, but you can set it to whatever you want unless your DNS provider doesn't let you. I've always run my own DNS, so maybe that's less of an option for hosted DNS these days, for all I know.
I have no doubt some of the benefits are definitely to be able to resell or access that data once again. I literally just told someone yesterday "don't pay for domain privacy, any registrar worth a damn will include it these days".
There's something about WHOIS I've never understood. If you run `whois ycombinator.com` you'll see name servers in the output.
But if you run `dig ycombinator.com ANY +noall +answer` you'll see name servers here too. If you look at all the output together, you'll find the same name servers are present in the WHOIS output and the DNS NS records. But wait, there's more. The name server `ns-225.awsdns-28.com` is present three times: in WHOIS, in the DNS NS records, and in the DNS SOA record.
Which of these name servers get used to resolve `ycombinator.com` to its IP address like when I do `ping ycombinator.com`?
What if the information between the WHOIS and DNS NS records and the DNS SOA records are inconsistent? Which record wins?
If you `ping`, your recursive resolver (like Google DNS, or your ISP's DNS servers) will do the recursive lookup for you.
WHOIS data is irrelevant to resolving the host IP address. The SOA will be used to find the primary name server (for an AXFR lookup, perhaps), but generally each NS entry will work in a round-robin fashion and the SOA isn't queried.
Most resolvers just ignore duplicate records, but I imagine some resolvers may change the "odds" to be more likely to pick the duplicated NS entry.
Finally, most authoritative servers do not want to spend resources on ANY queries and almost always don't return all records, or, like you saw, do not de-duplicate answers.
Thanks! Do you know why the name servers are part of the WHOIS data?
Same question for SOA record. If the NS entries are used in a round-robin fashion, why is the name server present in SOA record too?
> Do you know why the name servers are part of the WHOIS data?
The NS returned from the registrar's WHOIS server reflects the registrar's view; the NS returned from the TLD nameservers reflects the registry's view; the NS returned from the zone's authoritative nameservers reflects the registrant's view. These should typically be the same, but can differ.
> why is the name server present in SOA record too?
The NS in the SOA record is used for RFC2136 dynamic updates and RFC1996 zone replication.
That's the clearest explanation I've ever seen, thanks.
If you're trying to debug why a website's setup isn't working, the first step is to see if what the registry thinks the nameservers should be matches what the nameservers in DNS actually are. These can fall out of sync if e.g. the registry's connection to its DNS provider is experiencing issues. This does actually happen from time to time.
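A toy sketch of that sanity check, assuming you've already fetched the two NS lists (e.g. with `dig NS example.com @a.gtld-servers.net` for the registry view and `dig NS example.com @<zone-nameserver>` for the zone's own view); the provider names below are invented:

```python
def normalize_ns(names):
    """Normalize nameserver names: lowercase, strip the trailing dot."""
    return {n.lower().rstrip(".") for n in names}

def ns_delegation_mismatch(registry_ns, zone_ns):
    """Return (only_at_registry, only_in_zone); both empty means in sync.
    registry_ns: NS names the registry publishes at the TLD servers
    zone_ns:     NS records served by the zone's own nameservers"""
    reg, zone = normalize_ns(registry_ns), normalize_ns(zone_ns)
    return sorted(reg - zone), sorted(zone - reg)

# Hypothetical example: the registry still points at an old provider.
only_reg, only_zone = ns_delegation_mismatch(
    ["ns1.old-provider.example.", "ns2.old-provider.example."],
    ["NS1.old-provider.example", "ns1.new-provider.example"],
)
print(only_reg)   # ['ns2.old-provider.example']
print(only_zone)  # ['ns1.new-provider.example']
```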
The NS record wins. The data in WHOIS is just non-operational metadata, WHOIS is not used for lookups.
Which server gets used is usually randomized from the set of possible ones. Same for which of multiple A or AAAA records are used to connect to.
Us sysadmins would love to be able to specify weights or round robin or retries (like with SRV records) to move load balancing and failover to the clientside but for whatever reason browser vendors have rejected this for years.
In practice it will round-robin because all of those guys have the same performance characteristics but through whoever else is upstream of you in the DNS chain. The SOA isn't used for resolution so it doesn't matter there.
> In practice it will round-robin
Which data though? Is it the WHOIS name server data that is used for round-robin? Or the DNS NS record data?
Do you know why the name server is present in SOA if it isn't used?
The NS records and the WHOIS should be the same usually. One comes from the registrar's configs and the other from your next level upstream resolver (which should, unless it's cached and a recent change happened, be the same). But the thing that is used is whatever your next level upstream resolver is, which is the `dig` output unless you did `dig @someoneelse`.
The SOA nameserver is pretty much only significant for DNSSEC these days. In the AWS case there, I don't think it does anything unique. Pretty much there just to meet the standard.
I remember in the past I've managed to screw up my setup so that the name servers on WHOIS and name servers on DNS NS records mismatched. I can't remember which record won during name resolution.
I guess I still don't understand why the name servers need to be both in WHOIS records and DNS NS records. Does the name resolution use the name server data in WHOIS records in any form or manner?
In short, name resolution does not use the records in WHOIS.
Think of the WHOIS information as more of an administrative database, and the actual DNS servers (which are located at the location of the NS records) as the operational database.
It is useful to know, in your administrative database, how to get to the organisational database, but it does not hold all of the information -- just where it is located.
In operational contexts (actual DNS lookups), you only use the operational database (the nameservers).
In administrative contexts (transferring a domain between registrars), you use the information from the administrative database (WHOIS).
There are additional wrinkles, like GLUE records, but those are probably a bit beyond the scope of what you're asking.
People say WHOIS is useless these days due to WHOIS privacy, but it's useful for at least one thing: checking when a domain was registered/transferred. Fishy stuff tends to be registered/transferred recently. Also, older and larger companies tend not to hide their organizational identity.
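That check maps directly onto the `events` array of an RDAP response. A minimal sketch of the heuristic (the 30-day threshold and the sample dates are arbitrary):

```python
from datetime import datetime, timezone

def registration_age_days(rdap_events, now=None):
    """Age of the domain in days, from the RDAP "registration" event.
    rdap_events: the "events" array of an RDAP domain response."""
    now = now or datetime.now(timezone.utc)
    for e in rdap_events:
        if e["eventAction"] == "registration":
            reg = datetime.fromisoformat(e["eventDate"].replace("Z", "+00:00"))
            return (now - reg).days
    return None  # some registries omit the event

def looks_fishy(rdap_events, min_age_days=30, now=None):
    """Heuristic only: very new domains are worth a second look."""
    age = registration_age_days(rdap_events, now)
    return age is not None and age < min_age_days

events = [{"eventAction": "registration", "eventDate": "2025-03-01T00:00:00Z"}]
print(looks_fishy(events, now=datetime(2025, 3, 10, tzinfo=timezone.utc)))  # True
```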
Btw, I tried the icann-rdap CLI tool and the default rendered-markdown output mode is atrocious. Sea of output, each nameserver has one or more standalone tables taking up 15x$repetition lines, almost impossible to fish out useful info. The retro gtld-whois mode is so much cleaner. Their web tool https://lookup.icann.org/en/lookup is fine too, don't know why the rendered markdown mode isn't like that. WTF.
I like the `rdap` cli from https://www.openrdap.org (in Brew, too: https://formulae.brew.sh/formula/rdap#default). Very clean, concise output.
Good bye, then, whois.
I can remember times when you could still see the names and addresses of registrants in whois records. That was before abuse and fraud became everyday occurrences in today's internet.
I miss the times when we could still believe in basic human decency.
basic human decency is not incompatible with there being a very small but spammy minority.
though it seems this belief is tested on multiple fronts nowadays.
If it's not against the law, we can do it, right? /s
Are existing whois-clients going to be updated to support RDAP next to Whois, or will we have to use different clients?
I just did an
on a Debian (well, Devuan) system, and found nothing. Also could not find that phrase in the name of any executable in /usr/bin or /usr/sbin. :-(
The linked article points to a GitHub repo. Clone it and do `cargo install icann-rdap-cli`. Of course you need the Rust toolchain for that.
It's in experimental:
https://packages.debian.org/experimental/rdap
The main benefit of whois and RDAP is to see which registrar handles a domain and when there were recent changes or upcoming expiry etc. RDAP is also useful to see who operates an IP address etc. I've been using RDAP for a few years but the service has been spotty, hopefully that improves now.
Why isn't this data simply available as a custom DNS record type?
Seems far simpler than a whole custom protocol.
How would that work? Would your DNS server delegate that one field out to your registrar?
Why does your registrar need to proxy this? How it could work is that you simply create
Or some other class, so that keys can't clash (already exist), similar to how version.bind is partitioned into the CHAOS class.
Because you can just lie much more easily than your registrar can.
The domain owner already provides this information to the registrar and is responsible for keeping it up to date.
I don't understand how this relates to how easy it is to lie to your DNS server.
An E-Mail address is already in the SOA record.
When can I finally see an article announcing that ICANN has been sunsetted?
Why so flippant? The Internet would be in a sorry state without ICANN...
Can you explain more?
One bright side of ICANN being a California non-profit is that when they tried to sell off .org to their own confederates so they could juice up the prices they were stopped from doing it. If they were in other places, I imagine it would have gone through.
it was fun when having a network solutions/internic contact handle was a badge of honor.
the early internet was fun. whois was always a fun dimension.
is there a canonical rdap client that will end up everywhere? one of the nice things about the early Internet was that there were canonical utilities that were everywhere.
WHOIS usually needs its own port (43) open, so this is good I suppose: now it's all HTTPS. Now, if only passive DNS resolution data was part of this same API. As it stands today, if you're looking into WHOIS information, historical WHOIS and passive DNS are a must, and they are usually provided by commercial entities.
My favourite part of my .ca domains is that personal data is protected by default and I don't have to pay for it as an additional service.
There's no need for people to know my information because I happen to own a domain.
What does this mean for the command line tool whois? It definitely works still and it's still being updated...
> whois ycombinator.com
% IANA WHOIS server
% for more information on IANA, visit http://www.iana.org
% This query returned 1 object

refer:        whois.verisign-grs.com

domain:       COM
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)

contact:      administrative
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       info@verisign-grs.com

contact:      technical
name:         Registry Customer Service
organisation: VeriSign Global Registry Services
address:      12061 Bluemont Way
address:      Reston VA 20190
address:      United States of America (the)
phone:        +1 703 925-6999
fax-no:       +1 703 948 3978
e-mail:       info@verisign-grs.com

nserver:      A.GTLD-SERVERS.NET 192.5.6.30 2001:503:a83e:0:0:0:2:30
nserver:      B.GTLD-SERVERS.NET 192.33.14.30 2001:503:231d:0:0:0:2:30
nserver:      C.GTLD-SERVERS.NET 192.26.92.30 2001:503:83eb:0:0:0:0:30
nserver:      D.GTLD-SERVERS.NET 192.31.80.30 2001:500:856e:0:0:0:0:30
nserver:      E.GTLD-SERVERS.NET 192.12.94.30 2001:502:1ca1:0:0:0:0:30
nserver:      F.GTLD-SERVERS.NET 192.35.51.30 2001:503:d414:0:0:0:0:30
nserver:      G.GTLD-SERVERS.NET 192.42.93.30 2001:503:eea3:0:0:0:0:30
nserver:      H.GTLD-SERVERS.NET 192.54.112.30 2001:502:8cc:0:0:0:0:30
nserver:      I.GTLD-SERVERS.NET 192.43.172.30 2001:503:39c1:0:0:0:0:30
nserver:      J.GTLD-SERVERS.NET 192.48.79.30 2001:502:7094:0:0:0:0:30
nserver:      K.GTLD-SERVERS.NET 192.52.178.30 2001:503:d2d:0:0:0:0:30
nserver:      L.GTLD-SERVERS.NET 192.41.162.30 2001:500:d937:0:0:0:0:30
nserver:      M.GTLD-SERVERS.NET 192.55.83.30 2001:501:b1f9:0:0:0:0:30
ds-rdata:     19718 13 2 8acbb0cd28f41250a80a491389424d341522d946b0da0c0291f2d3d771d7805a

whois:        whois.verisign-grs.com

status:       ACTIVE
remarks:      Registration information: http://www.verisigninc.com

created:      1985-01-01
changed:      2023-12-07
source:       IANA

# whois.verisign-grs.com
>>> Last update of whois database: 2025-03-17T01:27:31Z <<<

It has already stopped working for domains on TLDs that have sunset WHOIS, and over the next few months it'll stop working for a lot more TLDs and registrars. The command line tool is nothing more than a thin client that queries a server's WHOIS endpoint.
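The thin-client referral chase is simple enough to sketch: open TCP to port 43, send the query plus CRLF, read until the server closes (RFC 3912), then follow any `refer:` line. The network function below is illustrative and not run here; the parser is exercised against the IANA-style output shown above:

```python
import socket

def whois_query(server: str, query: str, port: int = 43) -> str:
    """Raw WHOIS per RFC 3912: TCP to port 43, query + CRLF, read to EOF.
    (Illustrative; not invoked below, since it needs network access.)"""
    with socket.create_connection((server, port), timeout=10) as s:
        s.sendall(query.encode() + b"\r\n")
        chunks = []
        while data := s.recv(4096):
            chunks.append(data)
    return b"".join(chunks).decode(errors="replace")

def whois_referral(response: str):
    """Find the "refer:" line in an IANA WHOIS response, which names the
    server a thin client should query next (None if absent)."""
    for line in response.splitlines():
        if line.lower().startswith("refer:"):
            return line.split(":", 1)[1].strip()
    return None

iana_response = """\
% IANA WHOIS server
refer:        whois.verisign-grs.com
domain:       COM
"""
print(whois_referral(iana_response))  # whois.verisign-grs.com
```

So when a registry turns its WHOIS endpoint off, there is nothing the client can do: the protocol has no fallback beyond the referral chain.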
I would have taken your word for it
This is what it means:
$ rdapper ycombinator.com # cf. https://github.com/gbxyz/rdapper
Handle          : 147225527_DOMAIN_COM-VRSN
Status          : client transfer prohibited
secureDNS       : {"secureDNS":{"delegationSigned":false}}
objectClassName : domain
ldhName         : YCOMBINATOR.COM
nameservers     : {"nameservers":[
                    {"ldhName":"NS-1411.AWSDNS-48.ORG","objectClassName":"nameserver"},
                    {"ldhName":"NS-1914.AWSDNS-47.CO.UK","objectClassName":"nameserver"},
                    {"ldhName":"NS-225.AWSDNS-28.COM","objectClassName":"nameserver"},
                    {"ldhName":"NS-556.AWSDNS-05.NET","objectClassName":"nameserver"}]}
events          : {"events":[
                    {"eventDate":"2005-03-20T23:51:07Z","eventAction":"registration"},
                    {"eventAction":"expiration","eventDate":"2026-03-20T22:51:07Z"},
                    {"eventDate":"2025-02-14T02:53:36Z","eventAction":"last changed"},
                    {"eventDate":"2025-03-17T01:38:05Z","eventAction":"last update of RDAP database"}]}
================================ Terms of Use ================================
Service subject to Terms of Use.
================================ Status Codes ================================
For more information on domain status codes, please visit https://icann.org/epp
======================= RDDS Inaccuracy Complaint Form =======================
URL of the ICANN RDDS Inaccuracy Complaint Form: https://icann.org/wicf
Edit: Fixed formatting of command line/comment.
Glad I read this, I wasn't aware whois was being sunsetted. Now I have to change one of my critical services to do rdap. Wow. How can you sunset the main service that is the backbone of the internet?
I wonder which other old internet protocols fell into obsolescence.
Finger is not officially retired, but no one supports it. NNTP seems to have had a similar fate.
Missed opportunity to call the successor `whodat`
Stoked to see that ICANN reference implementations are now being written in rust!
https://github.com/icann/icann-rdap
From what I've seen most domain servers don't really implement the history components of RDAP, which is a shame - being able to see if a domain ownership lapsed or was transferred historically would be great for being able to determine if somebody's email address is still trustworthy or has been stolen by a domain transfer.
This really looks like a regression, in the sense that RDAP could be cheated.
I wasn't aware of rdap.
Anyone experienced with this: I am not seeing abuse contact info, usually a phone number or email. Am I supposed to follow hyperlinks to get this info or something? Like search the registrar for this data?
looks bad. I see a loss in trust there
I hope archive.org will host a WHOWAS service.
ICANN's DNS system is one of the only systems on the internet that requires people to continually pay money to have a name. X, YouTube, Facebook, Reddit, Twitch, etc. all let you register a name for free and without submitting all of your personal information. The entire model here is out of step with what users want.
i’m glad it requires money. $1/month for a top level name isn’t much, and it means there are lots of good names available rather than all of them being grabbed by someone not interested in using them. when making a reddit account it’s actually pretty tricky to find a decent name that’s available
I think both models have a place. Sometimes I just really want a persistent identifier that I can take with me (unlike an IP) with minimal maintenance. Even if it is something unreadable like a UUID.
We should totally have a free .uuid TLD (which will predictably get blocked by 90% of networks... Although DoH would probably still work)
Twitch for example will allow you take over usernames of accounts that are unused. Also having a good name is less important than you think. Most people don't navigate by going to exact identifiers. They just type the name of the thing into a search and relevant results will be returned. Dead or useless results should not rank high.
> continually pay money to have a name
...and to host associated services to resolve this name to an IP address, as well as administrative overhead
I'd rather not that my domain name is funded by ads and sponsorships, the way that "X, YouTube, Facebook, Reddit, Twitch, etc all" are (no love for open source or decentralised platforms btw? The more commercial the better, except when it costs you money?)
These days how can one register a domain anonymously, using crypto as payment, and without KYC?
Njalla is the only service I'm familiar with: https://njal.la/
No first hand experience, however.
They’re pretty expensive, and the nature of the service means that if they disappear, they have ownership of your domain and you have no recourse to get it back.
That's the nature of 'private' domain registration used more commonly, at least to some degree for many private registrations. If you read the agreement, you are transferring your domain registration to the privacy service, and they forward stuff to you. I don't know what happens if they disappear, however.
Worse: if Njalla decides you shouldn't have a domain - for any reason whatsoever, including "we don't like your web site" - they can seize it, and you have no legal recourse.
This is not a hypothetical, by the way.
It could be useful just as a landing page to direct users to a .bit domain.
You mean the "domains" that >99% of users can't even resolve, which can't be used to send or receive email, and which you can't have SSL certificates issued for? Don't be daft.
A self-signed SSL cert could work for it. There may also exist other solutions that we are not aware of.
99% of the target users will resolve it if they want access (by installing the necessary browser extension).
As for system emails, etc., they can come through any regular domain.
It'll be a pick one problem (secrecy, control) until say the big browsers support .bit domains directly doing a lookup on the block chain.
Porkbun accepts crypto
I can't really understand the desire to be completely anonymous. Isn't GDPR protection or some paid privacy protection enough?
wow! something I didn't expect to read today, or in the near future.
r dap me up
check out the rdap deployment dashboard - https://deployment.rdap.org/
it's still unsupported by a lot of TLDs and the rate limits are atrocious. some registrars only allow 10 requests per day and will group huge netblocks into one single block.
I haven't had a successful use of whois in probably over a decade. What was once a useful tool was destroyed by spammers harvesting email addresses and by privacy-oriented registrars.
I won't even notice it's gone
It is useful for finding out who owns an IP address, but that's about it.
BGP looking glasses are still a thing, so at least we have that.
`whois -r` lookups I use ARIN for.
Maybe I'm confused, but whois gave me the domain owner, while `whois -r` gave me ARIN IP netblock ownership.
ARIN is useful, whois is not.
This seems like it would break things.
I’m serious! I don’t know why we’re turning a fundamental command off, even if it didn’t work correctly for everything. Do you realize how much documentation and how many tools reference it? And it still can work.
I think people know you're serious (I'm not one of the downvoters), but that it seems silly to stop all innovation for the sake of legacy
https://en.wikipedia.org/wiki/Protocol_ossification is a big enough problem that we're being taught about this in school so that we're aware of the problem and maybe things get better in the future
shocking news indeed.
Major regression. How can we trust the internet now ?