Hacker News

I think there is something to be said about what is worth archiving. I don't know what it is that should be said though. It seems weird to me that as a society we might be saving things such as a 10 hour video of white noise for another 100 years rather than some personal blogs. What is and isn't worth saving?


It will probably end up being the most popular things, the most viewed or read. More copies of it, more likely to be archived.

This reminds me of books. I’m sure the majority of books from over a hundred years ago are lost because they weren’t popular. We haven’t really noticed their absence…


> I’m sure the majority of books from over a hundred years ago are lost

Especially if you include independently published books that weren't widely circulated. I wonder what percentage of total books this is.

My grandfather published a book before he passed away. It was never sold online or in any big retail stores. Once the last hard copy is lost, it's gone forever.


> Once the last hard copy is lost, it's gone forever.

I believe the Library of Congress will archive that book for you if you mail them a hard copy, assuming it has an ISBN, you’re in the US, etc.


That's a very good idea. Thanks for the tip.


That's interesting. Many countries have laws (or customs) requiring that everything published in book form be submitted to the national library. I know this is done here in Poland too, because my partner had her book published by a small publisher and "providing a copy of the book to the national library" was one of the publisher's responsibilities. I have no idea if it's a law or just a custom.


What's interesting there is how many authors of works we now consider classics only became popular well after their deaths.

Kierkegaard, Thoreau, Dickinson and Melville, for instance.

If their works had been lost, "we" probably wouldn't have noticed any of those absences either.


> I’m sure the majority of books from over a hundred years ago are lost

If they were in one of the university libraries that Google scanned, they're not "lost." But you're right; you can't read them. Congress should mandate that the Library of Congress, at least, get a copy to preserve them for the ages.

Read the Atlantic article

https://www.theatlantic.com/technology/archive/2017/04/the-t...

for the sad story.


When you publish a book or magazine in France you’re required to give 2 copies to the national library for archival purposes. Doesn't something like that exist in other countries?


It certainly does in Spain. We even extended it to videogames, although I don't know how much that achieves when so many games are barely playable before the first few patches, have much of their content released in later updates, and become unplayable once the servers close.


Source code + assets, or binary?


Just the compiled game program and assets, simply a copy of whatever build you published.


https://en.wikipedia.org/wiki/Legal_deposit

Not all books in the state libraries are equal. Historical copies and popular authors (popular among researchers, which is already a much bigger set) are exhibited and get attention; John Doe's book of family recipes gets sent to some giant dark warehouse people rarely visit.

It is easy to forget that it is an 18th-century solution born from an 18th-century approach to knowledge. Back then, bibliographies of everything printed in a certain year in a certain country could be compiled, and they were supposed to be more than just lists, to help other men of books keep up with Progress.


It certainly does in Poland. My partner's publishing contract contained a provision for the publisher to do it.


Apparently it was required by the Library of Congress in the U.S., but the Supreme Court might have nixed that because of the Constitution's 5th Amendment (must be reimbursed if required to turn over property).


Does that apply to all books, even if you have 10 copies printed and distribute them privately?


This is pretty much the best marketing for IPFS. The availability and number of backups of any piece of data is directly correlated with the number of people that use it.

For example, if LLM model weights were distributed with IPFS instead of corporate infrastructure (basically zero redundancy), the popular models would be very available and have a near-zero chance of being lost.

To state that again, the Llama models likely have tens of thousands of downloads, which would mean tens of thousands of servers and backups of the data, versus what we have now, which is essentially just one.

We need IPFS for data distribution. Tightly knit integration with git repos is an obvious match as well.
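
To put rough numbers on the redundancy argument, here is a minimal Python sketch. It is not the IPFS API: `content_id` is a hypothetical stand-in that uses a plain SHA-256 digest (real IPFS CIDs involve chunking and multihash encoding), and `loss_probability` assumes every copy fails independently with the same made-up probability, which is optimistic for real hosts.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Hex SHA-256 digest, a simplified stand-in for a content identifier."""
    return hashlib.sha256(data).hexdigest()


def loss_probability(p_single: float, copies: int) -> float:
    """Chance that every copy is lost, assuming independent failures."""
    return p_single ** copies


# Identical bytes get the same identifier no matter who hosts them,
# so every downloader that keeps serving the file is another backup.
weights = b"pretend these are model weights"
print(content_id(weights) == content_id(bytes(weights)))

# Even with a (made-up) 50% chance of losing any single copy,
# the chance of losing all of them shrinks fast as copies grow.
for n in (1, 10, 100):
    print(n, loss_probability(0.5, n))
```

The point is only the shape of the curve: one copy on one company's servers is the `n = 1` case, while a widely seeded file sits far down the tail.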


Absence? Technically we haven't even noticed their presence!


Absolutely. I wonder what percent of tweets or blog posts are even seen by one human?


> It will probably end up being the most popular things, the most viewed or read. More copies of it, more likely to be archived.

So, 10 hours of white noise yes, some person's personal blog where they poured their heart out, no.

Beautiful.


IMO social media got a little better once temporary stories came out.

Slack’s limited search history is a feature too - forces you to document using appropriate tools and not endless email threads…


In my experience it makes people ask the same questions over and over.


> Slack’s limited search history

I'm sure that I don't know what you mean. My employer is on a paid plan for Slack, and searches cover everything, as far back as I wish to go. Are you thinking of the limitations on the free license?


Yes, I mean the free version, which was good enough.

The other great thing about Slack is that anyone from the company can start an account without IT’s approval…


Well, I would call that rather disingenuous, because "free trial Slack" is certainly not the default for businesses which actually depend on it, and it's not about search capability at all: it's about retention of data.


I’d argue a ton of Slack business users were converted organically from free plans, but I have nothing to prove it with.


It would be very interesting for us to learn about something in ancient Egypt that is equivalent to white noise today. I think the issue of storage should be solved and made abundant. We should not be worrying about what to save, but rather about what happens if we cannot save it.


AI is going to give eternal life to a lot of data that would otherwise have died. As the tech evolves, businesses will be able to monetize by selling data (for walled gardens) or their pages will all be scraped, cleaned up and resold by multiple orgs (for stuff on the open web).


Depends on how many whales are left in 100 years.


I think there is something to be said about what is worth archiving. I don't know what it is that should be said though. It seems weird to me that as a monastery we might be saving things such as a 10 volume satirical piece about a forgotten Greek tyrant for another 100 years rather than some personal thoughts of an Egyptian philosopher. What is and isn't worth saving?



