At first I wondered if musl does it better, so I checked, and the version I have disables cancellation in the guts of `getaddrinfo`.
I've always thought APIs like `pthread_cancel` are too nasty to use. Glad to see well documented evidence of my crank opinion
pengaru · 6m ago
The asynchronous cancellation in particular is difficult to use correctly, but is also one of the most useful aspects of the api in situations where appropriate.
Imagine cpu-bound worker threads that do nothing but consume work via condition variables and spend long periods of time in hot compute-only loops working on said work... Instead of adding a conditional in the compute you're probably not interested in slowing down at all, you turn on async cancellation and pthread_cancel() the workers when you need to interrupt what's going on.
But it's worth noting pthread_cancel() is also rarely supported anywhere outside first-class pthreads-capable systems like modern linux. So if you have any intention of running elsewhere, forget about it. Thread cancellation support in general is actually somewhat rare IME.
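To make that concrete, here is a minimal sketch of the pattern, assuming a POSIX system that actually implements asynchronous cancellation. The names `spin_worker`, `spin_counter` and `run_and_cancel` are made up for illustration. The loop deliberately calls nothing at all, because POSIX only guarantees a handful of functions to be async-cancel-safe and things like `malloc` or stdio are not among them.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical compute-only state: no locks, no allocation, no syscalls,
   so nothing is left half-done if the thread dies mid-iteration. */
static volatile uint64_t spin_counter;

static void *spin_worker(void *arg)
{
    int old;
    (void)arg;
    /* Opt in to asynchronous cancellation: a pending or future cancel
       request may now be acted on at any instruction. */
    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old);
    for (;;)
        spin_counter++;   /* hot loop with no cancellation check in it */
    return NULL;
}

/* Starts the worker, cancels it, and returns 1 if it ended because of
   the cancel request, 0 if it ended some other way, -1 on error. */
static int run_and_cancel(void)
{
    pthread_t t;
    void *res;

    if (pthread_create(&t, NULL, spin_worker, NULL) != 0)
        return -1;
    if (pthread_cancel(t) != 0)
        return -1;
    if (pthread_join(t, &res) != 0)
        return -1;
    return res == PTHREAD_CANCELED;
}
```

The moment the loop body needs a lock or a heap allocation this stops being safe, which is probably why the technique stays confined to pure number crunching.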
rwmj · 45m ago
Netscape used to start a new thread (or maybe it was a subprocess?) to handle DNS lookups, because the API at the time (gethostbyname) was blocking. It's kind of amazing that we're 30 years on and this is still a problem.
nly · 38m ago
If you want DNS resolution to obey user/system preferences then you need to use the system provided API
rwmj · 34m ago
For sure! The only problem is there should be a non-blocking system-provided API and there isn't.
foota · 27m ago
System provided is maybe a strange word to use here since getaddrinfo is a libc function, not a system call.
rwmj · 25m ago
POSIX as the system, of course.
silon42 · 38m ago
As long as broken APIs exist, they will be problematic... they really should be deprecated.
Calling a separate (non-cancellable) thread to perform the lookup sounds like a viable solution...
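A rough sketch of what that could look like, not taken from curl or any libc: the lookup runs in a detached thread that is never cancelled, the caller waits with a deadline, and whichever side finishes last frees the shared state. `struct lookup`, `lookup_thread` and `lookup_with_timeout` are hypothetical names, and returning `EAI_AGAIN` on timeout is just a convention chosen here.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>

struct lookup {
    pthread_mutex_t mu;
    pthread_cond_t cv;
    char *host;
    int done;        /* set by the lookup thread when getaddrinfo returns */
    int abandoned;   /* set by the caller when it gives up waiting */
    int err;
    struct addrinfo *res;
};

static void lookup_free(struct lookup *l)
{
    if (l->res)
        freeaddrinfo(l->res);
    pthread_mutex_destroy(&l->mu);
    pthread_cond_destroy(&l->cv);
    free(l->host);
    free(l);
}

static void *lookup_thread(void *arg)
{
    struct lookup *l = arg;
    struct addrinfo *res = NULL;
    int err = getaddrinfo(l->host, NULL, NULL, &res);  /* may block a long time */
    int abandoned;

    pthread_mutex_lock(&l->mu);
    l->err = err;
    l->res = res;
    l->done = 1;
    abandoned = l->abandoned;
    pthread_cond_signal(&l->cv);
    pthread_mutex_unlock(&l->mu);
    if (abandoned)
        lookup_free(l);   /* the caller left, so this thread cleans up */
    return NULL;
}

/* Returns 0 and sets *out (free it with freeaddrinfo) on success, an EAI_*
   code on resolver failure, or EAI_AGAIN if the deadline passed first. */
static int lookup_with_timeout(const char *host, int timeout_sec,
                               struct addrinfo **out)
{
    struct lookup *l = calloc(1, sizeof *l);
    struct timespec deadline;
    pthread_t t;
    int rc = 0, err;

    if (!l)
        return EAI_MEMORY;
    l->host = strdup(host);
    if (!l->host) {
        free(l);
        return EAI_MEMORY;
    }
    pthread_mutex_init(&l->mu, NULL);
    pthread_cond_init(&l->cv, NULL);
    if (pthread_create(&t, NULL, lookup_thread, l) != 0) {
        lookup_free(l);
        return EAI_SYSTEM;
    }
    pthread_detach(t);

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_sec;

    pthread_mutex_lock(&l->mu);
    while (!l->done && rc == 0)
        rc = pthread_cond_timedwait(&l->cv, &l->mu, &deadline);
    if (!l->done) {
        l->abandoned = 1;             /* hand ownership to the lookup thread */
        pthread_mutex_unlock(&l->mu);
        return EAI_AGAIN;
    }
    err = l->err;
    *out = l->res;
    l->res = NULL;
    pthread_mutex_unlock(&l->mu);
    lookup_free(l);
    return err;
}
```

The catch is that an abandoned thread lingers until `getaddrinfo` returns by itself, so a burst of hanging lookups means a pile of stuck threads instead of a leak.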
Someone · 19m ago
> Then it needs to sort them if there is more than one address. And in order to do that it needs to read /etc/gai.conf
I don’t see why glibc would have to do that inside a call to getaddrinfo. Can’t it do that once at library initialization? If it has to react to changes to that file while a process is running, couldn’t it have a separate thread polling that file for changes, or use inotify to have a separate thread called when it changes? Swapping in the new config atomically might be problematic, but I would think that is solvable.
Even ignoring the issue mentioned it seems wasteful to open, parse, and close that file repeatedly.
loeg · 6m ago
I think the libc people might argue this level of functionality is just outside the scope of libc. (Arguably, it is a mistake for DNS to be part of libc, given how complicated it is.)
ComputerGuru · 2m ago
To be sure, complexity isn’t the determinant of whether something is or isn’t in scope for libc though.
jart · 13m ago
Why can't they help fix the C library in question? Cancelation is really tricky to implement for the C library author. It's one of those concepts that, like fork, has implications that pervade everything. Please give your C library maintainers a little leeway if they get cancelation wrong. Especially if it's just a memory leak.
nly · 36m ago
Why is running the DNS resolution thread a problem? It should be dequeuing resolution requests and pushing responses and sleeping when there is nothing to do
When someone kills off the curl context surely you simply set a suicide flag on the thread and wake it up so it can be joined.
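Something like this sketch, where the queue is reduced to a counter to keep it short. `struct resolver` and the `resolver_*` functions are invented for illustration and are not curl's API; a real worker would call `getaddrinfo()` where the comment says so.

```c
#include <pthread.h>
#include <stdbool.h>

struct resolver {
    pthread_t thread;
    pthread_mutex_t mu;
    pthread_cond_t cv;
    int pending;   /* stand-in for a real queue of resolution requests */
    int handled;
    bool stop;     /* the "please exit" flag */
};

static void *resolver_main(void *arg)
{
    struct resolver *r = arg;

    pthread_mutex_lock(&r->mu);
    for (;;) {
        /* Sleep while there is nothing to do and nobody asked us to stop. */
        while (!r->stop && r->pending == 0)
            pthread_cond_wait(&r->cv, &r->mu);
        if (r->stop)
            break;
        r->pending--;
        pthread_mutex_unlock(&r->mu);
        /* A real worker would run the blocking lookup here, outside the lock. */
        pthread_mutex_lock(&r->mu);
        r->handled++;
    }
    pthread_mutex_unlock(&r->mu);
    return NULL;
}

static int resolver_start(struct resolver *r)
{
    r->pending = 0;
    r->handled = 0;
    r->stop = false;
    pthread_mutex_init(&r->mu, NULL);
    pthread_cond_init(&r->cv, NULL);
    return pthread_create(&r->thread, NULL, resolver_main, r);
}

static void resolver_submit(struct resolver *r)
{
    pthread_mutex_lock(&r->mu);
    r->pending++;
    pthread_cond_signal(&r->cv);
    pthread_mutex_unlock(&r->mu);
}

/* Sets the flag, wakes the thread, and joins it. Requests still queued
   at that point are dropped. */
static int resolver_stop(struct resolver *r)
{
    int rc;

    pthread_mutex_lock(&r->mu);
    r->stop = true;
    pthread_cond_signal(&r->cv);
    pthread_mutex_unlock(&r->mu);
    rc = pthread_join(r->thread, NULL);
    pthread_mutex_destroy(&r->mu);
    pthread_cond_destroy(&r->cv);
    return rc;
}
```

Note that the join only returns promptly if the thread is idle or between requests. If it is stuck inside a blocking lookup, `resolver_stop()` waits for that lookup to finish.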
foota · 30m ago
The thread started sounds like it's single use, not a thread handling requests in a loop. Anyway, a single thread handling requests in a loop would serialize these DNS lookups which if they're hanging would be problematic.
loeg · 5m ago
Yes, but why? As GP notes, the thread doesn't have to be single-use.
rwmj · 33m ago
One problem may be that fork() kills background threads, so now any program that uses libcurl + fork has to have a new API to restart the DNS thread (or use pthread_atfork, which is a big PITA), and that might break existing programs using curl.
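For a sense of what the handler route involves, here is a hedged sketch using `pthread_atfork()`. The child handler only records that the worker is gone, and the thread is recreated lazily on the next call. All names (`ensure_worker`, `child_recovers_worker`, and so on) are hypothetical, `ensure_worker()` is assumed to be called from one thread at a time, and strictly speaking POSIX only promises async-signal-safe calls in the child of a multithreaded process, so calling `pthread_create()` there relies on what common implementations tolerate.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_t worker;
static int worker_running;   /* valid only in the process that set it */

static void *worker_main(void *arg)
{
    (void)arg;
    for (;;)
        pause();   /* placeholder for "wait for DNS requests" */
    return NULL;
}

static void after_fork_in_child(void)
{
    /* Only the forking thread survives in the child: the worker is gone. */
    worker_running = 0;
}

/* Starts the worker if this process does not have one. Not thread-safe. */
static int ensure_worker(void)
{
    static int registered;

    if (!registered) {
        if (pthread_atfork(NULL, NULL, after_fork_in_child) != 0)
            return -1;
        registered = 1;
    }
    if (!worker_running) {
        if (pthread_create(&worker, NULL, worker_main, NULL) != 0)
            return -1;
        pthread_detach(worker);
        worker_running = 1;
    }
    return 0;
}

/* Forks and returns 1 if the child noticed the missing worker and
   managed to start a new one, 0 if not, -1 on error. */
static int child_recovers_worker(void)
{
    pid_t pid;
    int status;

    if (ensure_worker() != 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        int ok = worker_running == 0 && ensure_worker() == 0
                 && worker_running == 1;
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Even this toy version ignores locks the worker might have held at the moment of the fork, which is a large part of why it is a PITA in a real library.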
ComputerGuru · 8m ago
It’s not too much of an exaggeration to say that everything about using fork() instead of vfork() plus exec() is essentially fundamentally broken in modern osdev without a whole stack of hacks to try and patch individual issues one-by-one.
loeg · 4m ago
A surmountable problem, sure.
gary_0 · 12m ago
> c-ares ... will not be able to do everything that glibc does.
Does anyone have any idea what things they're referring to here?
There might be a way to getaddrinfo asynchronously with io_uring by now. Otherwise just call the synchronous version in another thread and let it time out so the thread exits normally, right? Why bother with pthread_cancel?
loeg · 3m ago
io_uring is for calling kernel APIs; this is a userspace API.
Aardwolf · 37m ago
Maybe this is naive, but could there just be some amount of worker threads that run forever, wait for and take jobs when needed, and message when the jobs are done? Don't need to be canceled, don't block
danappelxx · 21m ago
If the DNS resolution call blocks the thread, then you need N worker threads to perform N DNS calls. Threads aren’t free, so this is suboptimal. OTOH some thread pools e.g. libdispatch on Apple operating systems will spawn new threads on demand to prevent starvation, so this _can_ be viable. Though of course this can lead to thread explosion which may be even more problematic depending on the use case. In libcurl’s situation, spawning a million threads is probably even worse than a memory leak, which is worse than long timeouts.
In general, what you really want is for the API call to be nonblocking so you’re not forced to burn a thread.
ComputerGuru · 4m ago
This is essentially what the previous (largely pathetic) excuse for asynchronous I/O on Linux did: the libc aio(7) interface faked truly asynchronous file IO by running the blocking calls on threads. It wasn’t great.