this post was submitted on 03 Mar 2026
Technology
From a Facebook post I made on February 17th:
There are giant AI data firms that promise they can go through massive troves of data and pull out both general and specific information that is actionable and accurate. Give one of them 6 million data points and it'll find all the links, organize them for you, and unmask hidden details that aren't visible to the naked eye.
Not one of those companies is stepping up to go through the publicly released Epstein files.
There were reports of people trying to unredact the files almost immediately.
But that's not the same, is it?
I don't think you can do literally the same thing on the Epstein files. Maybe I'm misunderstanding what you have in mind.
In theory, by combining the released files with information from public sources, it should be possible to figure out who the redacted names are based on writing style and other factors. We should be able to deanonymize them.
Hmm. Maybe, but it is not the same problem as the one discussed in the OP. I also have some doubts about the paper, but that's another story. You could try it out?
I'm not qualified to design the prompts, and home users can't really pile in 3 million+ documents.
Prompts are in the appendix: https://arxiv.org/abs/2602.16800
I don't know how far you'd get on the free tier, but it should be at least enough for a proof of principle to get other people to chip in. You didn't have qualms demanding that other people do this for free.
Mind that this is a serious GDPR violation in Europe. So there will be serious pressure on AI companies to prevent this kind of use.