A purported leak of 2,500 pages of internal documentation from Google sheds light on how Search, the most powerful arbiter of the internet, operates.
The leaked documents touch on topics like what kind of data Google collects and uses, which sites Google elevates for sensitive topics like elections, how Google handles small websites, and more. Some information in the documents appears to be in conflict with public statements by Google representatives, according to Fishkin and King.
You mean hosting your own crawler/indexer? That doesn’t really sound like a thing you could do cost-effectively.
No problem, we crowdsource the crawling, torrent style.
We outsourced that to Google for reasonable performance reasons. But they shit the bed so now there’s no choice but to do it ourselves.
ooh that might be an interesting app to run on veilid
What is that and how does it apply?
Source: https://en.wikipedia.org/wiki/Veilid
Right!
Ars
Federated bookmarks?
Federated directories. We’re going back to Yahoo like it’s 1995
Webrings!!!
Uh…I know we’re all just having fun here, but I need to be part of a webring again. If anyone is more than joking, I kinda need to know about it. Thanks.
there are tons of webrings still going these days!
Seriously? Cool. I’m going to go do some research then. And maybe entirely change the purpose of my blog, just to fit into one…
can you share a link to it if you’re comfortable with that?
I loved Geocities!
Neocities is trying to be a modern reincarnation https://neocities.org/
I mistook that as neopets
Yahoo patiently plotting its return from Japan.
I’m so ready for something like this. I’ve cleaned up my bookmarks and been waiting for alternatives to search engines.
SearxNG
You could use Common Crawl, it’s run by a nonprofit
https://en.wikipedia.org/wiki/Common_Crawl
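If anyone wants to poke at it without downloading whole archives, here is a rough sketch of how a lookup against the Common Crawl URL index could be built and parsed. It assumes the public index server at index.commoncrawl.org and its `url` / `output=json` query parameters; the crawl ID `CC-MAIN-2024-10` and the helper names are just placeholders I picked for illustration, so check the current list of crawls before relying on it. Nothing here touches the network, it only builds the query URL and parses a response body you fetched some other way.

```python
import json
from urllib.parse import urlencode

# Assumption: the public Common Crawl index server host.
INDEX_HOST = "https://index.commoncrawl.org"


def build_index_query(crawl_id: str, url_pattern: str) -> str:
    """Build a query URL for one crawl's index (hypothetical helper).

    crawl_id is something like "CC-MAIN-2024-10" (placeholder, pick a real one).
    url_pattern may end in a wildcard, e.g. "example.com/*".
    """
    params = urlencode({"url": url_pattern, "output": "json"})
    return f"{INDEX_HOST}/{crawl_id}-index?{params}"


def parse_index_lines(body: str) -> list[dict]:
    """Parse a newline-delimited JSON response body into a list of records."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]


if __name__ == "__main__":
    print(build_index_query("CC-MAIN-2024-10", "example.com/*"))
```

Each record in the response is expected to point at an archive file plus a byte offset and length, so in principle you only pull the pages you care about instead of crawling them yourself.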
Look up the YaCy repo on GitHub