Anna's Archive: An Update from the Team

369 jerheinze 110 8/18/2025, 4:31:48 PM annas-archive.org ↗

Comments (110)

ofou · 44m ago
Shadow libraries maintainers deserve a Nobel prize for their contributions to humanity. Satoshi would be proud.

vlade11115 · 1h ago
Also, they provide a torrents list that anyone can seed and be part of the long-term preservation.

https://annas-archive.org/torrents

boombapoom · 1h ago
fuck those guys, annas archive is one of the last good things about the internet.
Koshkin · 34m ago
> the last good things

Last but not least?

lysace · 45m ago
1. Information wants to be free. :-)

2. I used to think that way about The Pirate Bay guys until they hacked into the Swedish equivalent of the US social security number database and then fled to Cambodia. (Or did it from Cambodia. I don’t remember the exact timeline.)

What I mean to say is: I have been disappointed by my heroes before.

Personally I suspect Anna’s archive to be funded by Russia as a part of their "cold" war with the west. They are literally burning down giant commercial buildings in Europe. This seems like a no-brainer in comparison, in a risk vs benefit calculation.

tzs · 33m ago
If #1 is a reference to a famous quote from Stewart Brand, founder of the Whole Earth Catalog, it's only part of the quote. The rest is relevant:

> On the one hand you have—the point you’re making Woz—is that information sort of wants to be expensive because it is so valuable—the right information in the right place just changes your life. On the other hand, information almost wants to be free because the costs of getting it out is getting lower and lower all of the time. So you have these two things fighting against each other

He stated later more succinctly:

> Information Wants To Be Free. Information also wants to be expensive. ...That tension will not go away

crowcroft · 16m ago
What are social security numbers if not just another bit of information that wants to be free?

Or perhaps you are saying that people that have an interest in the availability of particular information should have some control on that information's freedom...

Ar-Curunir · 33m ago
People can do good things and bad things simultaneously. Unless me supporting the good things directly enables also the bad things, I don't see a reason to throw out the good thing.
Davidzheng · 41m ago
was the alternative for the pirate bay people jailtime?
gjsman-1000 · 44m ago
> Information should be free

I'm sick and tired of this misquote; as it was merely an observation of trends, and was never meant to be a moral maxim or mandate. If you truly believe information needs to be free as a moral mandate, share your company's source code first.

danielPort9 · 40m ago
I see it as “everyone deserves respect”. No need to overanalyse it. It’s one of those few things in life that are simply true, no proof needed.
renewiltord · 18m ago
I see it as "Carthage must be destroyed". No need to nitpick it. We must destroy Carthage.
justin66 · 39m ago
"Anna’s Archive itself has organized some of the largest scrapes: we acquired tens of millions of files from IA Controlled Digital Lending"

Not really helping in the big picture, here, guys.

thorn · 1h ago
Kudos to the team behind this project! It looks like they have improved the UI in the last year. The crucial problem right now is to remain accessible, or simply to survive. I have no idea how much effort is being put into it. I wonder whether it is possible to remain afloat despite all the efforts to take them down.
jauntywundrkind · 44m ago
There was a pretty major UI update in the past 2-5 days-ish.

Apologies for the minor grumble, but on mobile I used to be able to browse search results much more effectively; the new design only fits ~4-5 results on a screen.

freefaler · 6m ago
BTW, this is very useful:

https://open-slum.org/

cakealert · 23m ago
Can Anna's Archive claim to be a non-profit when it's effectively an illegal enterprise with unknown controllers?

They are even offering decent bounties: https://software.annas-archive.li/AnnaArchivist/annas-archiv...

Whoever is running it must be doing really well for themselves laundering all that crypto.

Also interestingly they don't offer a tor onion service, while the admin is most certainly technically competent to administer one given that he no doubt uses tor to insulate himself from his enterprise and launder crypto. What is the reasoning for that?

teraflop · 17m ago
Your comment seems like a non sequitur to me. Whether something is a "non-profit" has nothing to do with whether it receives or spends money. (See, e.g. the American Red Cross's ~$4B/yr budget.) It's about what it does with the money it has.

Obviously, since Anna's Archive is breaking the law, it can't conform itself to the normal legal/regulatory system that governs non-profit organizations. It can certainly still claim to be acting in the spirit of a non-profit, and it's up to you to decide whether you trust that claim. Nobody's forcing you to give them money.

gowld · 49s ago
Is Cosa Nostra a non-profit? The question doesn't make sense. It's a category error.

A non-profit is a corporate legal structure. An unregistered organization could be a cabal, a gang, a syndicate, a fellowship, a religion, a movement, a private club, or something else.

cakealert · 13m ago
The connotation of a non-profit is that it's being audited. It would be extremely silly to suggest otherwise.
badlibrarian · 2m ago
Audits have nothing to do with it; all entities are subject to audit.

The primary difference between a non-profit and a for-profit is that a non-profit does not distribute profit to shareholders, including the founders.

addaon · 6m ago
> The connotation of a non-profit is that it's being audited.

This is very geography-specific. In the US, 501(c)(3)s (what most people think of when they say "non-profit" where I am) have no general requirement for audits. There's also plenty of non-profit-by-some-definition organizations that never file a Form 1023, giving up some benefits of the 501(c)(3) regulations but in exchange being even less regulated.

teraflop · 9m ago
It may have that connotation to you, but in general (at least in the US) non-profit organizations are not required to have independent audits. Typically, that requirement only happens if they receive a certain amount of government funding. An organization may choose to undergo audits in order to make people feel better about donating to it.

I really, really don't think that anybody is being fooled or misled into thinking that Anna's Archive is a "legitimate" audited organization when they describe themselves as a non-profit.

SimianSci · 12m ago
Given the amount of hosting and storage needed to sustain this project, nobody is getting rich off of donations. Not to mention that the lifestyle tradeoffs that inevitably come with international fugitive status do not lend themselves to a very comfortable life.

The usage of crypto is entirely one of necessity, as controlling information and knowledge is something powerful people have clear stakes in. Many countries wield their financial systems to hold or acquire power. Information and knowledge are one form of such power.

Everything points to the Anna's Archive team being passionate ideologues as opposed to some criminal enterprise focused on profit motives.

cakealert · 8m ago
> Not to mention the lifestyle tradeoffs that inevitably come with international fugitive status do not lend themselves to a very comfortable life.

Anonymous international fugitive?

> Nobody is getting rich off of donations.

How can anyone aside from the beneficiary know that?

The extent to which the controller can get rich off this enterprise depends entirely on the unknown quantity of donated funds (and deals with AI companies) and his skill at laundering crypto (which darknet marketplace controllers doing far more illegal stuff can do).

southernplaces7 · 9m ago
illegal doesn't at all have to mean immoral or particularly wrong either. Laws are complex constructions, often created for decidedly hypocritical reasons of benefitting some at the expense of others.

Thus, who gives a shit if they're taking money from those who voluntarily subscribe? They still offer an absolutely incredible free service to who knows how many people who otherwise wouldn't be able to afford so much access to so much free information.

Given the behavior of the pro-copyright business interests and legal bodies of the world, and the outright hypocrisy of openly creating one set of rules on content piracy for certain corporations while applying another, harsher rule system for those who aren't so nicely connected, smug moralizing about something like Annas Archive has little grounding.

And aside from picking random crap out of your ass for smearing arbitrarily, what shred of evidence do you have of anyone there laundering crypto, and how?

cakealert · 22s ago
> what shred of evidence do you have of anyone there laundering crypto, and how

The controller's freedom. If they didn't launder it they wouldn't be free.

> They still offer an absolutely incredible free service

Actually their free downloads aren't particularly good when compared to some of the other only services that 'leech' from them.

And their torrent strategy could be altruistic but it could also be self interested. By spreading storage costs around and attracting more contributions.

What mainly interests me is how much money they are actually making, I suspect it's very profitable.

whirlwin · 23m ago
Just curious - what is the future of services like these? More and more content will be AI generated, to some degree. Should that content be aggregated as well?
iLoveOncall · 1m ago
> In recent weeks we’ve seen increased attacks on our mission.

A pretty rich thing to say when your mission is piracy.

I'm not against piracy at all, quite the contrary, but this is quite laughable.

dulpo · 1h ago
This is surprising. I thought last I heard they'd arrested the guy who was suspected of running the site, about a year or so ago. Guess I'm misremembering.

Also I'm surprised Cloudflare hasn't shut them down like they do for other dodgy sites.

lode · 1h ago
When accessing from Belgium the link is blocked by Cloudflare:

Error HTTP 451 Unavailable For Legal Reasons

In response to a legal order, Cloudflare has taken steps to limit access to this website through Cloudflare's pass-through security and CDN services within Belgium

clickety_clack · 7m ago
Man, I thought cloudflare stood in front of individual sites. When did they start becoming a filter on an individual’s web connections?
dulpo · 1h ago
Interesting. Seems to be only certain jurisdictions. I can access it no problem from the UK Vodafone network.
camtarn · 55m ago
I'm unable to resolve the domain on EE UK - looks like it's DNS blocked.

By comparison, on my work network (TalkTalk) I can resolve the domain but I get a connection reset from the site.

I think this might be the first time I've hit a DNS block. It feels rather eerie seeing people talking about a site that, from my point of view, doesn't even exist...

PaulRobinson · 9m ago
There's an inconsistent censoring of numerous websites across the UK. In short, the biggest ISPs (a list which changes over time) will block various sites (TPB, libgen, AA, and others), based on court orders taken out at different times.

In general, it's a good idea to use Private Relay if you're using Apple devices and have access to it, no matter what network you're on, and if you're doing anything you don't want your ISP to traffic capture you should be using VPNs and/or Tor.

There are a lot of legitimate reasons to want to use scraping sites that UK copyright law is not nuanced enough to protect, and so blanket bans just end up emerging at the demands of copyright owners (which more often than not, means Disney or Springer).

spaceport · 12m ago
It starts with one
teekert · 22m ago
Set proton VPN to Albania and enjoy the full internet is my experience.
spacedcowboy · 1h ago
Hmm. Even the title link above doesn't work for me on Virgin's cable, in the UK
dulpo · 1h ago
Do you see an error page / blocked page?

I used to get archive.org blocked and had to contact my provider to have the filters taken off.

spacedcowboy · 1h ago
Nope, it just takes forever, then eventually shows a blank screen...
barrell · 1h ago
Yep blocked by Ziggo in NL as well
telesilla · 1h ago
Whenever I'm in the Netherlands I need to set my DNS to 1.1.1.1 or similar, lots of blocks.
borski · 22m ago
Except that that’s CloudFlare, which is also blocking Anna’s Archive.
noble-lombax · 1h ago
I actually didn't know there were more error codes beyond error code 429
Mogzol · 1h ago
There's "431 Request Header Fields Too Large" which you will see occasionally. But after that 451 is the only other 400-level error code above 429. It was chosen as a reference to the book Fahrenheit 451.
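The status codes mentioned here can be checked against a standard library registry. A minimal sketch, assuming Python 3.8 or later, where `http.HTTPStatus` includes 451; the variable name `above_429` is only for illustration, and the list reflects whatever codes the installed Python version knows about rather than an authoritative registry:

```python
from http import HTTPStatus

# Every 4xx status code above 429 that this Python version's enum defines.
above_429 = [status.value for status in HTTPStatus if 429 < status.value < 500]

for code in above_429:
    # .phrase is the standard reason phrase that accompanies the numeric code.
    print(code, HTTPStatus(code).phrase)
```

A server that blocks a page in response to a legal order, as in the Cloudflare message quoted above, would send the 451 code with that reason phrase.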
mariusor · 52m ago
451 is kind of a novelty code, its meaning being related to Bradbury's "Fahrenheit 451" SciFi novel.
goku12 · 36m ago
5555624 · 1h ago
The two behind Z-Library were arrested in late 2022.
dulpo · 1h ago
Thank you, I think I must have got the details of that confused with the OCLC lawsuit.
baal80spam · 1h ago
annas-archive.li/blog, 2025-08-17

About recent events.

We are still alive and kicking. In recent weeks we’ve seen increased attacks on our mission. We are taking steps to harden our infrastructure and operational security. The work of securing humanity’s legacy is worth fighting for.

Since we started in 2022, we have liberated tens of millions of books, scientific articles, magazines, newspapers, and more. These are now forever protected from destruction by natural disasters, wars, budget cuts, and other catastrophes, thanks to everyone who helps with torrenting.

Anna’s Archive itself has organized some of the largest scrapes: we acquired tens of millions of files from IA Controlled Digital Lending, HathiTrust, DuXiu, and many more.

We have also scraped and published the largest book metadata collections in history: WorldCat, Google Books, and others. With this we’ll be able to identify which books are still missing from our collections, and prioritize saving the rarest ones.

Much thanks to all of our volunteers for making these projects happen.

We’ve forged some incredible partnerships. We’ve partnered with two LibGen forks, STC/Nexus, Z-Library. We’ve secured tens of millions additional files through these partnerships. And they are helping the mission by mirroring our files.

Unfortunately we have seen the disappearance of one of the LibGen forks. We don’t have further information about what happened there, but are saddened by this development.

There is a new entrant: WeLib. They appear to have mirrored most of our collection, and use a fork of our codebase. We have copied some of their user interface improvements, and are grateful for that push. Sadly, we are not seeing them share any new collections, nor share their codebase improvements. Since they haven’t shown commitment to contributing back to the ecosystem, we advise extreme caution. We recommend not using them.

In the meantime, we have some exciting projects in the works. We have hundreds of terabytes in new collections sitting on our servers, waiting to be processed. If you’re at all interested in helping out, feel free to check out our Volunteering and Donate pages. We run all of this on a minimal budget, so any help is greatly appreciated.

Keep fighting.

stonecharioteer · 1h ago
Please remain up. Libgen no longer works. I've used IRC for fiction and non-fiction but tech books need Anna's Archive and Libgen. I buy the physical copy with company budget to pay the author but I need DRM-free ebooks to read comfortably on my Tab S9 Ultra.
DyslexicAtheist · 41m ago
libgen is still there
duckkg5 · 19m ago
Not accurate. You are probably looking at a site like https://libgen.ac/ which states clearly at the top: "Not a Part of Library Genesis. ex libgen.io, libgen.org"

The real one has been down for a long time.

gregorygoc · 22m ago
What’s the url?
slt2021 · 1h ago
Anna's archives is possibly the greatest site ever.

Infinite love to the team <3

xtracto · 1h ago
Kind of... the fact that they have the actual data behind a "soft" paywall (waiting times and terribly slow transfers otherwise) makes me a bit skeptical of their "goodwill".
SimianSci · 39m ago
No such thing as free when bandwidth costs money. Any service online that is handing out things for free without restriction is getting its return through unscrupulous means and shouldn't be trusted. Anna's Archive straddles the line enough to allow people to download books for free but not at too great an expense to the volunteers who pay out of pocket to support the project.
Vektorceraptor · 3m ago
So what about the authors and creators of the works? They did it for free?
0cf8612b2e1e · 1h ago
Their backdoor plan to get rich! Not going to fool me this time VCs!!

Everyone involved is taking on significant personal liability and hosting expenses. Not sure what more you expect.

klik99 · 41m ago
Yes spot on, crazy that asking for an optional pittance for less bandwidth throttling on such a huge and risky project can be seen as exploitative.
nulld3v · 29m ago
I believe you only hit the paywall when you try to use the search engine & download individual files. They still offer the underlying data for free archival/mirroring via torrents.
exe34 · 30m ago
you should ask for a refund!
mattl · 1h ago
Bandwidth isn’t free of charge
bibelo · 1h ago
and hosting
oguz-ismail · 1h ago
> We recommend not using them

I've been using WeLib since April and had a good experience so far

SimianSci · 42m ago
If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic. I agree with the Anna's Archive team: it serves no one to have one of these players in the space hoarding their own collections and not sharing them with other archiving projects. It makes the collection extremely vulnerable and at risk of becoming lost knowledge as time goes on.
neilv · 3m ago
> If efforts like this are to be sustainable in any lasting way, participants need to be cooperative, not parasitic. I agree with the Anna's Archive team,

That's an odd combination.

jeron · 36m ago
I disagree with how this is framed. shadow libraries thrive on decentralization, any other servers mirroring a collection is better than no mirrors at all
SimianSci · 19m ago
I'm not sure how you disagree with this. Decentralization relies on multiple copies in multiple places. The fact is that WeLib is not allowing other libraries like Anna's Archive to mirror or copy their exclusive collection, hence the recommendation not to use them.

Otherwise, please explain how I am missing your point.

carlosjobim · 28m ago
No honour among thieves.
keroro · 54m ago
Why use them over annas archive?
oguz-ismail · 23m ago
cleaner interface
max_ · 1h ago
The entire internet needs to be re-designed to stand up against attacks.

- DDOS attacks

- Spamming

- UK like surveillance laws

- LLM scraping

Why is it that there is almost no initiative for this?

grues-dinner · 1h ago
The Internet has been redesigned. It's just not been redesigned with your interests in mind and at least some of the "attacks" are features to the right people.
theturtletalks · 1h ago
The precursor to Bitcoin was an interesting project called Hashcash. It was built to combat email spam: the sender had to spend compute solving a moderately hard hash puzzle and put the result in an email header. The recipient can then easily verify that the sender "paid" the cost.
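The asymmetry described here (expensive to produce, cheap to check) can be sketched in a few lines. This is a simplified illustration, not the real Hashcash stamp format: the `resource:counter` layout, the function names `mint`, `verify`, and `leading_zero_bits`, and the 12-bit default difficulty are all assumptions chosen to keep the example short. It does follow Hashcash in using SHA-1 and in counting leading zero bits of the hash.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        # First non-zero byte: add its leading zeros and stop.
        bits += 8 - byte.bit_length()
        break
    return bits


def mint(resource: str, bits: int = 12) -> str:
    """Sender side: brute-force a counter until the stamp's hash is 'rare' enough."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        digest = hashlib.sha1(stamp.encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


def verify(stamp: str, resource: str, bits: int = 12) -> bool:
    """Recipient side: a single hash confirms the work was done for this resource."""
    if not stamp.startswith(resource + ":"):
        return False
    digest = hashlib.sha1(stamp.encode()).digest()
    return leading_zero_bits(digest) >= bits


stamp = mint("alice@example.com")
print(stamp, verify(stamp, "alice@example.com"))
```

Minting needs about 2^bits hash attempts on average, while verification is always one hash, which is why the cost lands on bulk senders and not on recipients.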
progval · 1h ago
There are, but they each have their tradeoffs.

Proof of work and micropayments (eg. Xanadu or Internet Mail 2000) schemes solve spamming and LLM scraping, but are more expensive or more CPU-intensive.

P2P systems like FreeNet too, but they are harder to use and more storage intensive and make it easier to spy on individual users.

Tor solves UK-like surveillance laws but it's slower and makes it easier to spam.

freefaler · 1h ago
Decentralization and interoperability, including the TCP routing protocols, give the network the ability to grow freely, but make those kinds of attacks easier.

The easiest way to mitigate those problems would be to decrease openness and centralize more. It might lead to even worse things than DDoS.

GuB-42 · 41m ago
RFC-3514 [1] proposed an effective solution against attacks.

So see, there are initiatives, but people treat it as a joke, maybe because of when it was released.

[1] https://www.ietf.org/rfc/rfc3514.txt

uberman · 1h ago
Out of curiosity, do you see the archive in question as being part of the problem or that it needs protection from the issues you raise?
butchkass · 1h ago
Go right ahead
ilovefood · 1h ago
I fully agree. It's difficult though because I genuinely believe that the solution space overlaps with cryptography, which is quickly discounted as a viable option because it is now laden with negative connotations.
goku12 · 46m ago
Cryptography has negative connotations? Like what? Do you mean cryptocurrency by any chance? (If so, it's feasible to practice cryptography without touching cryptocurrency).
gia_ferrari · 23m ago
Not op, but in my bubble:

- DRM.

- Owner-unfriendly device locks (such as manufacturer-controlled secure boot or locked-down OSes).

- Inability to audit network traffic from one's own devices, i.e. an IoT device.

- Remote attestation, when in opposition to open computing.

I could also see folks seeing the use of cryptography as "having something to hide" - I don't personally agree.

vpribish · 1h ago
nah. cryptography is not seriously held back by cryptocurrency
monster_truck · 1h ago
I'll start the wiki
meindnoch · 1h ago
I'll design the logo!
IAmBroom · 43m ago
I'll make a GUI in Visual Basic!
exe34 · 28m ago
I'll bring my axe!
spogbiper · 13m ago
i'll make snacks
anon191928 · 1h ago
because they will come after new design? how do you not see this?
dulpo · 1h ago
Redesigned like how?
random3 · 1h ago
"Be the change you want to see in the world"
exe34 · 28m ago
the problem is that anybody who does that work will be targeted very quickly by the people in power.

even if it's decentralised, it'll be banned one way or another and you'll be hunted down.

NoMoreNicksLeft · 1h ago
I dread these. I still remember the rarbg announcement from a few years back I saw here. Do I even dare click the link?
HedgeMage · 1h ago
Not that scary. Click it.
crest · 1h ago
They just announced that they're still in the fight.
ronsor · 1h ago
I think you'll be happy if you do
revskill · 1h ago
OpenAI needs to train their models based on these books, not stackoverflow or reddit.
burkaman · 1h ago
They do: https://xcancel.com/vxunderground/status/1888019174133276846, https://www.theverge.com/2023/7/9/23788741/sarah-silverman-o...

The tweet only names Meta, but it would be very surprising if OpenAI didn't do the same thing.

CamperBob2 · 1h ago
Anyone who doesn't train on all material available, legal or otherwise, will be outcompeted by teams that do, including those based in countries that don't respect Western copyright law. It's that simple.

Either this practice is judged (or legislated) to be fair use, or copyright is done. It's also that simple.

spaceport · 2m ago
Quality. The transformable value in all data is not equal.
atrettel · 1h ago
I'm not convinced that LLMs and other AI models need to train on all material available. A representative sample is better.

I'll ignore the legality aspects in my response. I think coming up with a representative sample of all relevant information would be better in the long term (teams will not be outcompeted on long time horizons). Why don't the companies do this? Because it is easier to just "carpet bomb the parameter space" and worry about the potential confounding [1] and sampling bias [2] later. Coming up with a representative sample requires domain expertise and that is expensive in terms of time and money. But it reduces the total amount of training data and should reduce the amount of time and resources it takes to build the models. That may matter now that models are quite large.

This is definitely a design decision with tradeoffs on both sides. I can entertain the notion that we don't have time to sample things, but I think we are all too often dismissing the long-term benefits of proper sampling.

(In terms of the legality aspects, judges are trying to "split the baby" [3] in my opinion by saying that training on stuff you got legally is OK but training on pirated material isn't. So nobody is going to recommend training on pirated material in the first place.)

[1] https://en.wikipedia.org/wiki/Confounding

[2] https://en.wikipedia.org/wiki/Sampling_bias

[3] https://www.404media.co/judge-rules-training-ai-on-authors-b...

alfalfasprout · 1h ago
So, what? Authors and rights holders are supposed to just take it?

Copyright law exists for a reason. Trying to improve an LLM doesn't give you the right to flout our legal system. Yes, other countries might have an advantage in LLM training as a result but so be it.

crazygringo · 1h ago
> Authors and rights holders are supposed to just take it?

If it's judged as fair use, then yes. And then it's not flouting anything.

Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.

For example, nonfiction authors already "just take it" when reviews describe the main points of their book without paying them a cent. The justification is that it's for the greater good, and rights are limited.

atrettel · 54m ago
Judges have recently ruled [1] that training on legally obtained materials constitutes fair use, but we will have to see in the long term if that ruling holds up.

[1] https://www.404media.co/judge-rules-training-ai-on-authors-b...

Night_Thastus · 51m ago
>the whole point of fair use is to benefit society

I'll stop you right there - I really don't think that applies at all. Does 'society' really benefit when the whole thing is a funnel for enormous amounts of wealth to go to already-gigantic companies like Microsoft?

bfrankline · 58m ago
> Remember the whole point of fair use is to benefit society by allowing reuse of material in ways that don't directly copy large portions of the material verbatim.

How do you think masked language models work?

bee_rider · 41m ago
It seems like it could conceivably be fair in some sense, as long as the models were actually released as open-weights (for the benefit of society).
bugufu8f83 · 1h ago
They do, don't they? I think OpenAI uses libgen.

Meta managed to get into a private ebook torrent tracker called Bibliotik a few years ago to use for training Llama and the resulting publicity essentially killed the tracker.

renewiltord · 19m ago
Hmm, does not comply with age verification for eating disorders. Dangerous site for children.

Also not compliant with data retention rules.

I don't know, man, seems like it should be illegal in Europe and the UK. I will email to make sure it's on regulators' radar.

Europeans would not make bad law.