Protecting user content and data on Lemmy

silas@programming.dev · edit-2 9 months ago

Faresh@lemmy.ml · edit-2 9 months ago

(I’m not a lawyer, nor have I studied law.)

I’ve seen some users add a license to the end of each of their comments. One idea might be this: Add a feature to Lemmy where each user can choose a content license that applies to everything they post. For example, one user might choose to reserve no rights for their content (like CC0) because they don’t care how their data is used. Another user might not want companies profiting off their posts, so they’d choose a more restrictive license.

I don’t think licensing your content prevents it from being used in AI models. Services such as Copilot were trained on data that included GPL-licensed source code without having to comply with the terms the GPL imposes on modifying or copying such code. This isn’t limited to restrictive licenses such as the GPL either: under permissive licenses such as the MIT license, they would also have to credit the authors of the original work. It seems that, for now, copyright law doesn’t apply to data generated by AI models and that they don’t need to comply with the license terms of the training data (or at least they don’t seem to have been penalized for violating copyright law yet, AFAIK).

And even if it wasn’t licensed, companies can’t use your works without your permission (unless it constitutes fair use). When you license a work, you are simply giving permission to other people to do things with your work they would otherwise not be allowed to do.