

An LLM cannot generate truly random numbers. It has to pull from the numbers its model was built to include.
I get it, but we should as a community try to be better than that.
AI won’t fail. It is already past the point where failing or being a fad was an option. Even if we wanted to go backwards, the steps that were taken to get us to where we are with AI have burned the bridges. We won’t get 2014-quality search engines back. We can’t unshitify the internet.
That AI is the one you make, or at least host, yourself. No one is going to host an online AI for you that is 100% ethical, because that isn’t profitable and it is very expensive.
When you villainize AI, you normalize AI use as being bad. The end result is not people stopping their use of AI; it is people being more okay with using less ethical AI. You can see this with folks driving SUVs and big trucks. They intentionally pick awful choices because the fatigue of being wrong for driving a car at all makes them accept that it doesn’t matter.
It feels dumb, it is dumb, but it is what happens.
Nestle bottling water is bad, so my solution will be to never drink any water and make fun of people who do. This is how it always comes off to me.
Who in the world eats hard shell tacos?
Why is your hobby more important than their hobby?
The point of my second statement is that if you made an AI that stores and retrieves phone numbers, the model could reasonably use phone-number chunks in its random number generation. A phone number can normally be broken into 3 to 6 chunks of 1 to 5 digits, which are reasonable sizes to tokenize. If you then asked it for a random number, I think it is reasonable that it would be as likely, if not more likely, to use the data from the phone number list as to use the core 0 to 9 tokenized digits, unless you specifically tried to keep the two apart.
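To make that chunking idea concrete, here is a toy sketch of how a number falls apart into a few multi-digit pieces. This is my own illustration, not Meta's actual tokenizer, and the example number is from the fictional UK range:

```python
import re

# Illustrative only: not a real tokenizer, just a toy split showing how a
# phone number can break into a handful of chunks of at most 5 digits.
def toy_chunk(phone: str) -> list[str]:
    chunks = []
    for group in re.findall(r"\d+", phone):   # grab each run of digits
        for i in range(0, len(group), 5):     # cap every chunk at 5 digits
            chunks.append(group[i:i + 5])
    return chunks

print(toy_chunk("+44 7700 900123"))  # -> ['44', '7700', '90012', '3']
```

Four chunks of 1 to 5 digits, which is exactly the kind of piece a model could reassemble when asked for a "random" number.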
This is a WhatsApp AI, so I think asking it for Tim’s number is a use case they trained on; it needs to be a phone book. My guess is they said list A is a set of public numbers for training things like what a phone number looks like, and list B is the private user numbers. Now, while a random number could be a random string of digits, it could also be that the LLM is too likely to pull a combination that is actually a real number.
So is this a case where it randomly pulled together 11 digits and magically hit the roughly 1 in 100 chance that a random string of digits shaped like a UK phone number belongs to a user? Or is it a case where it pulled from a public combo list of 4 tokens and randomly reformed a real number that was both public and private? The latter seems more likely to me. We probably won’t ever get to know.
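For a sense of what that "1 in 100" guess implies, here is a quick back-of-envelope sketch. Both inputs are assumptions on my part, not sourced figures:

```python
# Rough sanity check of the "1 in 100" guess; both numbers are assumptions.
possible_uk_mobiles = 10 ** 9        # UK mobiles are 07 plus 9 more digits, ~1e9 combinations
assumed_user_numbers = 10_000_000    # hypothetical count of real WhatsApp numbers in that range

odds = possible_uk_mobiles / assumed_user_numbers
print(f"A well-formed random UK mobile hits a real user about 1 in {odds:.0f} times")
# -> 1 in 100 under these assumptions; a larger user base makes the odds worse.
```

The point is just that the odds of a truly random collision depend entirely on how big the real user base is relative to the number space, and they are not astronomically small either way.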
If I were making this AI chatbot, though, I would have it check against the most privacy-critical data I hold before sharing anything as a random number. WhatsApp phone numbers are its users’ IDs. Even if it truly randomly generates one, it should verify whether it is a private number and refuse to output it, as it showed it could do when questioned about where the number came from.
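Something like this, very roughly. The function and the private-number store here are hypothetical, not anything WhatsApp actually exposes:

```python
import re

# Sketch of an output-layer privacy check: find phone-shaped strings in the
# model's reply and redact any that match a known user number.
def redact_private_numbers(reply: str, private_numbers: set[str]) -> str:
    def check(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group(0))   # strip spaces, +, dashes
        return "[redacted number]" if digits in private_numbers else match.group(0)

    # crude pattern for phone-shaped strings: ~10-17 chars of digits/spaces/dashes
    return re.sub(r"\+?\d[\d \-]{8,15}\d", check, reply)

# Example: the model "randomly" produced a string that happens to be a real user.
users = {"447700900123"}
print(redact_private_numbers("Here's a random number: +44 7700 900123", users))
# -> Here's a random number: [redacted number]
```

Even a crude check like this at the output layer would likely have caught the exact failure being discussed before the number ever left the chat.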