• 49 Posts
  • 1.55K Comments
Joined 2 years ago
Cake day: June 25th, 2023


  • it was previously quite easy, but it seems to be getting much harder as Google locks the ecosystem down, alongside their push towards Manifest V3 (’cause nobody can have nice things) and their potential move of Chromebooks to Android

    a used Chromebook might still be a good option for a cheap Linux laptop though; your mileage may vary. if you go that route, look for one that lets you replace its bootloader. the Google one is extremely annoying if you’re running a third-party OS, because it’ll wipe your SSD and reinstall ChromeOS if you hit spacebar during boot (probably Google’s way to punish the user for straying too far from the ecosystem)

  • once upon a time a guy named paully sucked at lisp, but most people couldn’t tell so they figured he must be good at it

    then he made a website that was an ugly orange color, and everyone assumed it was ugly on purpose even though every website paully makes is ugly and barely functions under load

    then paully implemented moderation structures on the orange site that both cloak and enable discrimination and bullying, and everyone figured that couldn’t be correct because the orange site said it had good moderation

    and now paully’s godawful startup accelerator is run by openly fascist little freaks and all it does anymore is AI, but the orange site says it’s prestigious and not at all a multi-layered affinity grift

    the moral of the story is fuck paul graham

  • I think you’re absolutely correct, and this feels to me like the only reason why we’re seeing some of the bizarre shit we’ve been keeping an eye on:

    • several old forums, all of which are unique high-quality data sources, are being polluted by their own admins with backdated LLM-generated answers. this destroys that forum as a trustworthy data source and removes it as competition for the LLM that already scraped the forum — and, as a bonus, it also makes training a future LLM on that data source utterly impractical without risking model collapse.
    • Wikipedia refuses to compromise on quality in general, so it’s under increasing political pressure to change. the game here is to shut down or pollute the original data source by any means necessary, so that the only way to access that data becomes an LLM. the people behind the AI startups are experts at creating monopolies, and shutting down a world-class data source like Wikipedia or making it otherwise unusable would guarantee a monopoly position for them.

  • I’ve stopped myself from starting this exact project several times, with the fediverse as the curation source. I’ve talked about this before, but interestingly Postgres’ full-text search is effectively the complete core of a search engine, minus what you’d need for crawling and ranking (which is where curation and a bit of scripting would come in)

    other than resources and time, one big open question is how to do this kind of thing as a positive part of the fediverse — to not make the same mistake that a bunch of techbros already have and index the fediverse without consent. how does one make the curation process simultaneously consensual and also automated enough that it can be reasonably ruggedized against abuse?
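
to make the consent question a bit more concrete, here’s a rough sketch of what an opt-in index could look like. this is not Postgres: it’s a toy in-memory inverted index in plain Python standing in for the matching Postgres would do with `tsvector`/`tsquery`, and every name in it (`ConsentIndex`, `opt_in`, `opt_out`, and so on) is made up for illustration. the only idea it shows is that a post gets indexed only if its author has opted in, and that opting out removes everything of theirs again.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ConsentIndex:
    """Toy inverted index that only stores posts from opted-in authors.

    Hypothetical sketch: a real version would keep this state in a database
    and would need an abuse-resistant way to verify who is opting in.
    """

    def __init__(self):
        self.opted_in = set()             # authors who have consented
        self.docs = {}                    # doc id -> (author, text)
        self.postings = defaultdict(set)  # term -> set of doc ids

    def opt_in(self, author):
        self.opted_in.add(author)

    def opt_out(self, author):
        """Withdraw consent and drop everything already indexed for the author."""
        self.opted_in.discard(author)
        owned = [doc_id for doc_id, (a, _) in self.docs.items() if a == author]
        for doc_id in owned:
            _, text = self.docs.pop(doc_id)
            for term in tokenize(text):
                self.postings[term].discard(doc_id)

    def add(self, doc_id, author, text):
        """Index a post; returns False (and stores nothing) without consent."""
        if author not in self.opted_in:
            return False
        self.docs[doc_id] = (author, text)
        for term in tokenize(text):
            self.postings[term].add(doc_id)
        return True

    def search(self, query):
        """Return the sorted ids of posts that contain every query term."""
        terms = tokenize(query)
        if not terms:
            return []
        hits = set.intersection(*(self.postings.get(t, set()) for t in terms))
        return sorted(hits)
```

usage would look something like `idx.opt_in("alice")`, then `idx.add("1", "alice", "some post text")`, then `idx.search("post")`. it deliberately skips ranking and crawling, and it doesn’t answer the hard part of the question, which is how to verify opt-ins automatically without opening the door to abuse.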