I think there is something to be said about what is worth archiving. I don't know what it is that should be said though. It seems weird to me that as a society we might be saving things such as a 10 hour video of white noise for another 100 years rather than some personal blogs. What is and isn't worth saving?
It will probably end up being the most popular things, the most viewed or read. The more copies of something exist, the more likely it is to be archived.
This reminds me of books. I’m sure the majority of books from over a hundred years ago are lost because they weren’t popular. We haven’t really noticed their absence…
> I’m sure the majority of books from over a hundred years ago are lost
Especially if you include independently published books that weren't widely circulated. I wonder what percentage of total books this is.
My grandfather published a book before he passed away. It was never sold online or in any big retail stores. Once the last hard copy is lost, it's gone forever.
That's interesting. Many countries have laws (or customs) requiring that everything published in the form of a book be submitted to the national library. I know this is done here in Poland too, because my partner was having her book published by a small publisher and "providing a copy of the book to the national library" was one of the publisher's responsibilities. I have no idea if it's a law or just a custom.
> I’m sure the majority of books from over a hundred years ago are lost
If they were in one of the university libraries that Google scanned, they're not "lost." But you're right; you can't read them. Congress should mandate that the Library of Congress, at least, get a copy to preserve them for the ages.
When you publish a book or magazine in France you’re required to give 2 copies to the national library for archival purposes. Doesn't something like that exist in other countries?
It certainly does in Spain. We even extended it to videogames, although I don't know how much that achieves when so many games are barely playable before the first few patches, have much of their content released in later updates, and in many cases become unplayable after the servers close.
Not all books in the state libraries are equal. Historical copies and popular authors (popular among researchers, a much bigger set already) are exhibited and get attention, while John Doe's book of family recipes gets sent to some giant dark warehouse people rarely visit.
It is easy to forget that it is an 18th-century solution born from an 18th-century approach to knowledge. Back then, bibliographies of everything printed in a given year in a given country could be compiled, and they were supposed to be more than just lists, to help other men of books keep up with Progress.
Apparently it was required by the Library of Congress in the U.S., but I believe a federal appeals court might have nixed that because of the Constitution's 5th Amendment (you must be compensated if you are required to turn over property).
This is pretty much the best marketing for IPFS. The availability and number of backups of any piece of data is directly correlated with the number of people that use it.
For example, if LLM NN model weights were distributed with IPFS instead of corporate infrastructure (basically zero redundancy), the popular models would be very available and have a near-zero chance of being lost.
To state that again, the llama models likely have tens of thousands of downloads, which would mean tens of thousands of servers and backups of the data, versus what we have now, which is essentially just one.
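A back-of-the-envelope sketch of that argument, under loud assumptions: every downloader keeps serving their copy, and each copy is lost independently with the same probability. Neither is guaranteed in practice, and the 50% loss rate and the function name below are made up purely for illustration.

```python
def chance_all_copies_lost(per_copy_loss: float, copies: int) -> float:
    """Probability that every one of `copies` replicas is lost.

    Assumes losses are independent and equally likely for each copy,
    which is a simplification, not a property of IPFS itself.
    """
    return per_copy_loss ** copies


# Hypothetical numbers: each copy has a 50% chance of disappearing.
for copies in (1, 3, 10, 100):
    print(f"{copies:>4} copies -> {chance_all_copies_lost(0.5, copies):.3g}")
```

Even with a pessimistic per-copy loss rate, the chance of losing everything shrinks exponentially with the number of copies, which is the whole point of the "one server vs. tens of thousands" comparison.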
We need IPFS for data distribution. Tightly knit integration with git repos is an obvious match as well.
I'm sure that I don't know what you mean. My employer is on a paid plan for Slack, and searches cover everything, as far back as I wish to go. Are you thinking of the limitations on the free license?
Well, I would call that rather disingenuous, because "free trial Slack" is certainly not the default for businesses which actually depend on it, and it's not about search capability at all: it's about retention of data.
It would be very interesting for us to learn about something in ancient Egypt that is equivalent to white noise today. I think the issue of storage should be solved and made abundant. We should not be worrying about what to save, but rather about what happens if we cannot save it.
AI is going to give eternal life to a lot of data that would otherwise have died. As the tech evolves, businesses will be able to monetize by selling data (for walled gardens) or their pages will all be scraped, cleaned up and resold by multiple orgs (for stuff on the open web).
I think there is something to be said about what is worth archiving. I don't know what it is that should be said though. It seems weird to me that as a monastery we might be saving things such as a 10 volume satirical piece about a forgotten Greek tyrant for another 100 years rather than some personal thoughts of an Egyptian philosopher. What is and isn't worth saving?