I’m not saying anything particularly new and I’m mostly repeating what I’ve been saying since the announcement, but I’d argue that all of those caveats are entirely down to branding and PR and not engineering.
App design, yes. Microsoft made their Timeline 2 so that it actually shows you in the UI all the screenshots that it took of you doing stuff, and that’s creepy. Apple doesn’t tell you what they’re pulling and they are almost certainly processing it further to get deeper insights… but they do it in the background so you don’t have to think about it as much.
So again, better understanding of the user, messaging and branding. Same fundamental functionality. Way different reactions.
Yes, but Apple doesn’t need to screenshot shit, that’s the point: they trained their customers to only use Apple apps, where they have full control, and they force developers to use their AI API to stay relevant.
Microsoft failed to convince users to use Microsoft everywhere, except with Teams and the Office suite.
Google has the relevant data of most Microsoft users, and screenshotting it (like scraping) would have allowed Microsoft to get to that data without paying Google for it.
But that is kinda shady and thus not widely accepted.
But they do, though.
The use cases they have presented are literally asking for a picture you received last week that contained a particular piece of text, selecting the text and copying it over.
I know Apple made it seem like AI is magic, but here in the real world that uses real world computers you need to know what’s on the image to do that.
But hey, no, that’s my point. You understand what taking a screenshot of your desktop looks like. You can grok that to the extent that you can feel weird about the idea of somebody doing that to you every five seconds. You can’t wrap your head around the steps of breaking down all your information to the extent Apple is describing. Yeah, they know exactly what you did and when, and what you looked at and what it said and how it relates to everybody you know and to your activity. But since you can’t intuitively understand what that requires you don’t know enough to feel weird about it.
That right there is good UX, even if the ultimate level of intrusion is the same or higher.
This is not screenshotting; the picture is already a picture which the AppleAI has access to.
Apple solves it by having the AI daemon run with relatively low rights and analyse stuff directly through an API where apps expose data to it.
This is way less bad than just screenshotting everything and, as an added bonus, apps can give the AppleAI data not even shown on screen, which is impossible with the screenshot idea.
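To make the difference concrete, here’s a toy sketch in Python. Every name in it (`DonatedItem`, `AssistantIndex`, `donate`, `screen_capture_view`) is made up for illustration and is not Apple’s or Microsoft’s actual API. It only shows the shape of the idea: apps push items to the assistant, so the assistant can find things that were never rendered, while a screenshot-based collector is limited to what was on screen.

```python
from dataclasses import dataclass, field


@dataclass
class DonatedItem:
    app: str
    title: str
    body: str
    visible_on_screen: bool


@dataclass
class AssistantIndex:
    """Hypothetical store the assistant daemon reads from; apps push data into it."""
    items: list = field(default_factory=list)

    def donate(self, item):
        # The app decides what to hand over; the daemon just receives it.
        self.items.append(item)

    def search(self, text):
        needle = text.lower()
        return [i for i in self.items if needle in i.body.lower()]


def screen_capture_view(items):
    # A screenshot-based collector can only ever see what was rendered.
    return [i for i in items if i.visible_on_screen]


index = AssistantIndex()
index.donate(DonatedItem("mail", "Invoice", "Your invoice number is 4471", visible_on_screen=False))
index.donate(DonatedItem("chat", "Lunch", "Pizza at noon?", visible_on_screen=True))

api_hits = index.search("invoice")
screen_hits = [i for i in screen_capture_view(index.items) if "invoice" in i.body.lower()]
print(len(api_hits), len(screen_hits))
```

In this toy setup the unread mail is found through the API path and is invisible to the screen-capture path.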
Hold on, how is this “low rights” if it’s looking at and reading every single file you have on your device AND every single thing you access online or have remotely stored? Surely from a purely technical standpoint looking at the screen is less access by every reasonable metric. You don’t look at it, the AI doesn’t know about it. Right? Do we have a sense of shared reality here?
Don’t get me wrong, that’s still very effective spyware and I certainly don’t want a screenlogger running on my device, Apple or Microsoft. But if you present to me a system that constantly reads every file you access in any capacity and remembers it, displayed onscreen or not, versus one that looks at your screen… well, the one that looks at your screen knows less about you by any measure. OBS can record your screen, but it doesn’t know what the emails you haven’t read while you’re recording say.
The info is easier to extract, easier to be made human readable, definitely creepier in concept, probably easier to exploit. But less intrusive. Can we at least agree on that?
You have other daemons on your device that have more rights. It doesn’t need rights if it gets packages delivered from apps by the API.
Of course a big flaw in Apple’s system is that you don’t know exactly which system app gives what data to your personal AppleAI LLM.
So long story short, Microsoft should have let your personal LLM be trained on the screenshots and not let those screenshots be saved to disk, only held temporarily in RAM.
I bet that the Snapdragon chips aren’t fast enough to do that well enough, and this is typical Microsoft brute-force problem solving.
Of course, if someone were able to steal your trained AppleAI (like Apple, for example) they could still ask it anything about you.
I don’t know how Apple plans to keep your trained LLM safe, but we will see soon I guess.
Maybe it is stored in iCloud in order to sync with all devices, which of course could be a problem for many people.
I use Arch, btw
I don’t know that this is a matter of performance, considering MS is pushing a specific TOPS spec to support these features. From the spec we have, several of the supported devices Apple is flagging for this feature are below the 40 TOPS spec required for Copilot+. I think that’s more than they’re putting in M4, isn’t it?
Granted, Apple IS in fact sending some of this data to a server to get processed, so on that front they are almost certainly deploying more computing power than MS at the cost of not keeping the processing on-device. Of course I get the feeling that we disagree about which of those is the “brute force” solution.
I also think you’re misunderstanding what Apple and MS are doing here. They’re not “training” a model based on your data. That’d take a lot of additional effort. They presumably have some combination of pre-existing models, some proprietary, some third party, and they are feeding your data into the models in response to your query to serve as context.
That’s fundamentally different. It’s a different step in the process, it’s a different piece of work. And it’s very similar to the MS solution because in both cases when you ask something the model is pulling your data up and sharing it with the user. The difference is that in MS’s original implementation the data also resided on your drive and was easily accessible even without querying the model as long as you were logged into the user’s local account.
But the misconception is another interesting reflection of how these things are branded. I suppose Apple spent a ton of time talking about the AI “learning” about you, implying a gradual training process, rather than “we’re just gonna input every single text message you’ve ever sent into this thing whenever you ask a question”. MS was all “we’re watching you and our AI will remember watching you for like a month in case you forget”, which certainly paints a different mental picture, regardless of the underlying similarities.
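If it helps, here’s roughly what I mean by “feeding your data in as context”, as a toy Python sketch. This is my assumption about the general pattern, not anything Apple or MS has published, and `build_prompt`, `FROZEN_WEIGHTS` and the sample records are all made up. The point is that the pre-trained weights never change; the personal data only shows up in the prompt that gets built for each question.

```python
def build_prompt(question, user_records, max_records=3):
    """Pick the records sharing the most words with the question and paste them into the prompt."""
    q_words = set(question.lower().split())
    ranked = sorted(
        user_records,
        key=lambda r: len(q_words & set(r.lower().split())),
        reverse=True,
    )
    context = "\n".join(ranked[:max_records])
    return f"Context:\n{context}\n\nQuestion: {question}"


# Stand-in for a pre-trained model: a fixed artifact that nothing below modifies.
FROZEN_WEIGHTS = (0.12, -0.7, 0.33)

records = [
    "Message from Sam: flight lands at 18:40 on Friday",
    "Photo caption: receipt from the hardware store",
    "Note: renew passport in March",
]

prompt = build_prompt("when does my flight land", records, max_records=1)
print(prompt)
```

A real system would use something smarter than word overlap to pick the records, but the division of labour is the same: retrieval picks the data, the unchanged model reads it.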
I understood it like Apple provides a pre-trained LLM and it is then trained on device with user data, directly resulting in new weights and configuration for each person’s personal AppleLLM.
To me that seems more reasonable, because the data is way less random and strictly orchestrated by the limitations Apple defines through the API that you need to use in order to integrate your app with the user’s personal AppleLLM.
And I still agree, the weights and configuration of the AppleLLM are as critical as 100 GB of screenshots of your windows, but definitely harder to understand if extracted.
I just don’t think that’s plausible at all. I mean, they can “train” further by doing stuff like storing certain things somewhere and I imagine there’s a fair amount of “dumb” algorithm and programming work going on under the whole thing…
…but I don’t think there’s any model training on device. That’s orders of magnitude more processing power than running this stuff. Your phone would be constantly draining for months, it’s just not how these things work.
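For a back-of-envelope feel of the gap, here’s a rough Python sketch. It leans on the usual rule of thumb of about 2 FLOPs per parameter per token for running a model and about 6 for a training step (forward plus backward), and it assumes a full pass over all the weights. The model size and token counts are numbers I made up, not anything from Apple or MS.

```python
def inference_flops(params, tokens):
    # Rule of thumb: ~2 FLOPs per parameter per token for a forward pass.
    return 2 * params * tokens


def training_flops(params, tokens, epochs=1):
    # Rule of thumb: ~6 FLOPs per parameter per token (forward + backward).
    return 6 * params * tokens * epochs


PARAMS = 3_000_000_000          # assumed model size
QUERY_TOKENS = 2_000            # assumed prompt + answer length for one question
USER_DATA_TOKENS = 50_000_000   # assumed amount of personal data to train on

one_query = inference_flops(PARAMS, QUERY_TOKENS)
one_training_pass = training_flops(PARAMS, USER_DATA_TOKENS)
ratio = one_training_pass / one_query
print(f"one training pass ~ {ratio:,.0f} queries' worth of compute")
```

With these made-up inputs a single training pass costs as much as tens of thousands of queries. How that translates into battery life depends on hardware details I’m not going to guess at.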