What data? The data that the user affirmatively agrees to send them that is anonymized? That data?
I’m sure you understand this, but anonymized data doesn’t mean it can’t be deanonymized. Given the right kind of data, or enough context, they can figure out who you are fairly quickly.
Ex: You could “anonymize” GPS traces, but they would still show the house you live at and where you work unless you strip out a lot of the info.
http://androidpolice.com/strava-heatmaps-location-identity-doxxing-problem/
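To make the GPS point concrete, here is a rough sketch of how little it takes. The function name, the night-time window, and the sample coordinates are all made up for illustration; it only assumes that people are usually home overnight.

```python
from collections import Counter
from datetime import datetime


def likely_home(trace, precision=3):
    """Guess a home location as the most frequent night-time grid cell.

    trace: iterable of (iso_timestamp, lat, lon) tuples with no user ID attached.
    precision: decimal places to round to (3 is roughly a 100 m cell).
    """
    cells = Counter()
    for ts, lat, lon in trace:
        hour = datetime.fromisoformat(ts).hour
        # Assumption: points logged between 22:00 and 06:00 are probably at home.
        if hour >= 22 or hour < 6:
            cells[(round(lat, precision), round(lon, precision))] += 1
    return cells.most_common(1)[0][0] if cells else None


# Made-up points: there is no name or account ID anywhere in the data.
trace = [
    ("2024-06-10T23:15:00", 52.52012, 13.40491),
    ("2024-06-11T02:40:00", 52.52009, 13.40502),
    ("2024-06-11T09:05:00", 52.50110, 13.37650),  # daytime, probably work
    ("2024-06-11T23:50:00", 52.52015, 13.40488),
]
print(likely_home(trace))
```

Swap the night-time filter for working hours and the same few lines guess the workplace, and a home plus a workplace is often enough to single out one person.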
Now with LLMs, sure, you could “anonymize” which user said or asked for what… but if something identifying is sent in the request itself, it won’t be hard to deanonymize that data.
I don’t know about the US, but in European GDPR parlance, if it can be reversed then it is NOT anonymized and it is illegal to claim otherwise. The correct term is pseudonymized.
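A minimal sketch of the difference, as I understand it: replacing the user ID with a keyed hash gives a stable token, but whoever holds the key can link it back, and the prompt text can still give the person away on its own. The key, the helper names, and the example record are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"hypothetical-server-side-key"  # made up for this example


def pseudonymize(user_id: str) -> str:
    """Replace a user ID with a keyed hash: a stable token, not anonymity."""
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]


def find_user(token, known_user_ids):
    """Anyone holding the key can re-link a token by hashing candidate IDs."""
    for uid in known_user_ids:
        if hmac.compare_digest(pseudonymize(uid), token):
            return uid
    return None


record = {
    "user": pseudonymize("alice@example.com"),
    # The ID is hidden, but the prompt itself still names the person.
    "prompt": "Write a cover letter for Alice Example",
}
print(record["user"])
print(find_user(record["user"], ["bob@example.com", "alice@example.com"]))
```

The second print recovers the original ID, which is exactly the kind of reversibility that would make this pseudonymized rather than anonymized.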
So you would rather submit your non-anonymized data? Because those bastards will find a way to deanonymize it. Is Apple doing the right thing or not?
What? No. I would rather use my own local LLM where the data never leaves my device. And if I had to submit anything to ChatGPT I would want it anonymized as much as possible.
Is Apple doing the right thing? Hard to say; any answer here will just be an opinion. There are pros and cons to this decision, and it’s up to the end user to decide if the benefits of using ChatGPT are worth the cost of their data. I can see some useful use cases for this tech, and I don’t blame Apple for wanting to strike while the iron is hot.
There’s not much you can really do to strip out identifying data from prompts/requests made to ChatGPT. Any anonymization of that part of the data is on OpenAI to handle.
Apple can obfuscate which user is asking for what, as well as specific location data, but if I’m using the LLM and I tell it to write up a report while including my full name in my prompt/request… that’s all going directly into OpenAI’s servers and logs, which they can eventually use to help refine/retrain their model at some point.
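As a rough illustration of why scrubbing prompts is hard, here is a naive pattern-based filter. The regexes and placeholder tags are my own assumptions, not anything Apple or OpenAI has described. It catches things with a fixed shape, like an email address or a phone number, but a name typed in free text goes straight through.

```python
import re

# Deliberately simple patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def scrub(prompt: str) -> str:
    """Replace email addresses and phone-like numbers with placeholder tags."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    return PHONE.sub("[PHONE]", prompt)


prompt = (
    "Write a report for Jane Doe, reach me at "
    "jane.doe@example.com or +1 555 010 9999."
)
print(scrub(prompt))
```

The email and phone number get replaced, but “Jane Doe” is still in the output, and so is anything else identifying that doesn’t follow a predictable pattern.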
Do you have proof they’re sending it to OpenAI?
I believe I heard it’s done on device or on iCloud servers then deleted.
I mean, that’s the claim at least
https://security.apple.com/blog/private-cloud-compute
IIRC they demonstrated an interaction with Siri where it asks the user for consent before enriching the data through ChatGPT. So yeah, that seems to mean your data is sent out (if you consent).
I’d say the burden of proof is on Apple to show that it’s being done on-device or that all processing is done on iCloud servers.
You’re saying that OpenAI is just going to hand over their full ChatGPT model for Apple to set up on their own servers for free?
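But from the article itself:
the partnership could burn extra money for OpenAI, because it pays Microsoft to host ChatGPT’s capabilities on its Azure cloud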
I get it if they created a small version of their LLM to run locally, but I would expect Apple to pay a price even for that.
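I think you may be confusing this ChatGPT integration with Apple’s own LLM that they’re working on…
Again, from the linked article:
Still, Apple’s choice of ChatGPT as Apple’s first external AI integration has led to widespread misunderstanding, especially since Apple buried the lede about its own in-house LLM technology that powers its new “Apple Intelligence” platform.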
https://security.apple.com/blog/private-cloud-compute/ (see the section on Verifiable Security)
Thanks! It’s a good read and I like the idea of a Private Cloud Compute (PCC) system, but it doesn’t mention anywhere that ChatGPT will be running in that PCC system (if you were trying to imply that).
And while OpenAI could implement something similar to PCC, I haven’t seen them announce that anywhere either.
I don’t trust OpenAI but I do trust that Apple is doing what it can.
Still really valuable
The point is that they can use that data for further training. They want to build a monopoly like the one Google has in search.
There’s Bing, and some others. I’m using Kagi. You can pretty much drop one in for another.
Google has a significant amount of market share, but it doesn’t really have the ability to determine the terms on which a consumer can get access to search services, which is what lets a monopoly be a monopoly.
They’ve got a monopoly over providing some services to Android users, maybe.
Like Google did with user queries and crawling data. I’m just saying everyone is happily giving these companies data. You are welcome to not use the GPT functionality just like you are welcome to use DuckDuckGo. I’m not getting the hostility to Apple. Microsoft on the other hand…