LLMs can unmask pseudonymous users at scale with surprising accuracy (arstechnica.com)

submitted 1 day ago by return2ozma@lemmy.world to c/technology@lemmy.world

[–] FauxPseudo@lemmy.world 87 points 1 day ago (20 children)

From a Facebook post I made on February 17th:

There are giant AI data firms that promise they can go through massive troves of data and pull out general and specific information from them. Information that is actionable and accurate. Give it 6 million data points and it'll find all the links and organize them for you and unmask hidden details that aren't visible to the naked eye.

Not one of those companies is stepping up to go through the publicly released Epstein files.

[–] General_Effort@lemmy.world 5 points 1 day ago (6 children)

There were reports of people trying to unredact the files almost immediately.

[–] FauxPseudo@lemmy.world 4 points 1 day ago (5 children)

But that's not the same, is it?

[–] General_Effort@lemmy.world 2 points 15 hours ago (1 child)

I don't think you can do literally the same thing on the Epstein files. Maybe I'm misunderstanding what you have in mind.

[–] FauxPseudo@lemmy.world 1 point 14 hours ago (1 child)

In theory, by combining the information in the released files with information from public sources, it should be possible to figure out who those redacted names are, based on writing style and other factors. We should be able to deanonymize them.
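To make the "writing style" part concrete, here is a minimal sketch of one classic stylometric approach: compare character trigram frequencies between an unattributed text and writing samples from known candidates, then rank candidates by cosine similarity. This is an assumption-laden toy, not the method from the linked article or paper. The function names (`char_ngrams`, `cosine_similarity`, `rank_candidates`) and the sample texts are hypothetical, and a real attempt would need far more text per author and far more careful validation.

```python
from collections import Counter
import math


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in lowercased, whitespace-normalized text."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(unknown_text: str, known_samples: dict[str, str], n: int = 3) -> list[tuple[str, float]]:
    """Rank candidate authors by stylistic similarity to the unknown text, best match first."""
    unknown_profile = char_ngrams(unknown_text, n)
    scores = [
        (name, cosine_similarity(unknown_profile, char_ngrams(sample, n)))
        for name, sample in known_samples.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample texts, purely for illustration.
    samples = {
        "candidate_a": "the quick brown fox jumps over the lazy dog",
        "candidate_b": "zzzz yyyy xxxx",
    }
    for name, score in rank_candidates("the quick brown fox jumps", samples):
        print(f"{name}: {score:.3f}")
```

A high score only means two texts share surface habits such as spelling, punctuation, and common word fragments. On short or heavily edited text this kind of ranking is easy to fool, so it is at best a starting point for a proof of principle.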

[–] General_Effort@lemmy.world 1 point 12 hours ago (1 child)

Hmm. Maybe, but it is not the same problem as the ones discussed in the OP. I also have some doubts about the paper, but that's another story. You could try it out?

[–] FauxPseudo@lemmy.world 1 point 9 hours ago (1 child)

I'm not qualified to design the prompts and home users can't really pile in 3 million+ documents.

[–] General_Effort@lemmy.world 1 point 2 hours ago

Prompts are in the appendix: https://arxiv.org/abs/2602.16800

I don't know how far you get on the free tier, but it should be at least enough for a proof of principle to get other people to chip in. You didn't have qualms about demanding that other people do this for free.

Mind that this is a serious GDPR violation in Europe. So there will be serious pressure on AI companies to prevent this kind of use.
