It's understandable that when things change as quickly as they do today, it takes a while for our ideas of how things work to catch up. One misconception deserves to be cleared up, because it touches on something so sensitive: the suggestion that Apple (or Google, or anyone else) maintains a special folder somewhere in which all your naughty pictures are kept. You're right to be suspicious, but fortunately, that's not how it works.
These companies are, in a sense, scanning your photos. They use sophisticated image recognition algorithms that can readily detect everything from dogs and boats to faces and actions.
When a dog is detected, a "dog" tag is added to the metadata the service keeps for that photo, alongside things like the time the photo was taken, the exposure settings, and the location. It's a very low-level process: the system doesn't really know what a dog is, only that photos producing certain numbers (corresponding to various visual features) get this label. But now you can search for these things, and the system can find them easily.
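To make that concrete, here is a minimal sketch in Swift of what the tagging step might look like. The `PhotoRecord` and `Classification` types, the `tag` function, and the 0.6 confidence cutoff are all hypothetical illustrations, not any company's actual implementation.

```swift
import Foundation

// Hypothetical metadata record a service might keep for each photo.
struct PhotoRecord {
    let id: UUID
    let capturedAt: Date
    let location: (latitude: Double, longitude: Double)?
    var tags: Set<String> = []   // e.g. ["dog", "beach"]
}

// Hypothetical classifier output: a label plus the model's confidence in it.
struct Classification {
    let label: String
    let confidence: Float        // 0.0 ... 1.0
}

// Tagging is just: run the classifier, keep labels above a confidence
// threshold, and attach them to the photo's metadata. The system never
// "understands" a dog; it only stores the label the model produced.
func tag(_ photo: inout PhotoRecord,
         with classifications: [Classification],
         threshold: Float = 0.6) {
    for result in classifications where result.confidence >= threshold {
        photo.tags.insert(result.label)
    }
}
```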
This analysis usually takes place inside a sandbox, and very little of what the systems determine ever makes it outside of that sandbox. There are special exceptions, of course, for things like child abuse imagery, for which very specialized classifiers are explicitly allowed to report beyond the sandbox.
That sandbox used to have to be big enough to contain a web service: your photos, contents and all, were uploaded when you used Google Photos, iCloud, or the like, and analyzed there. That's no longer necessarily the case.
Thanks to improvements in machine learning and processing power, the same algorithms that previously had to run on banks of huge servers are now efficient enough to run directly on the phone. So now your photos get the "dog" tag without having to be sent to Apple or Google for analysis.
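As an illustration of what on-device classification can look like in practice, here is a small sketch using Apple's Vision framework, whose built-in classifier runs entirely on the device. The confidence cutoff and the way the labels are used afterward are assumptions made for this sketch, not a description of how Photos itself is built.

```swift
import Vision

// A minimal sketch of on-device tagging with Apple's Vision framework.
// VNClassifyImageRequest uses a classifier bundled with the OS, so the
// image never has to leave the phone for this step.
func onDeviceTags(for imageURL: URL, minimumConfidence: Float = 0.5) throws -> [String] {
    let handler = VNImageRequestHandler(url: imageURL, options: [:])
    let request = VNClassifyImageRequest()
    try handler.perform([request])

    // Keep only labels the model is reasonably confident about,
    // e.g. "dog", "boat", "beach". The 0.5 cutoff is an arbitrary
    // value chosen for this sketch.
    return (request.results ?? [])
        .filter { $0.confidence >= minimumConfidence }
        .map { $0.identifier }
}
```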
This is probably a much better system in terms of security and privacy: you are no longer relying on someone else's hardware to examine your private data and trusting them to keep it private. You still have to trust them, but there are fewer parts and steps to trust, a simplification and shortening of the "chain of trust."
But conveying that to users can be a challenge. What they see is that their private photos, perhaps very private ones, have been assigned to categories and sorted without their permission. It's hard to believe that could happen without a company sticking its nose in somewhere.
Part of it is a user interface failure. When you search in the Photos app on an iPhone, it displays what you searched for (if it exists) as a "category." That suggests the photos are sitting "in" a folder somewhere, labeled "car" or "swimsuit" or whatever. What we have here is a failure to communicate how the search actually works.
A limitation of these photo classification algorithms is that they are not very flexible. You can train one to spot, say, the 500 most common objects found in photos, but if your photo doesn't contain one of them, it doesn't get tagged at all. The "categories" displayed when you search are simply the common objects the systems are trained to detect. As mentioned above, it's a fairly rough process: really just a confidence threshold that certain objects appear in the picture. (In the picture above, for example, the photo of me in a soundproof room was labeled "cardboard," I suppose because the walls look like milk cartons?)
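That fixed vocabulary is easy to see in code. Continuing the Vision sketch above, recent versions of the framework can report the complete list of labels the built-in classifier knows; anything not on that list simply cannot be tagged, no matter how prominent it is in the frame. (The printout here is my own formatting, and this particular call assumes a recent OS version.)

```swift
import Vision

// The built-in classifier only knows a fixed set of labels. If an object
// in your photo isn't on this list, it will never be tagged.
func printKnownLabels() {
    let request = VNClassifyImageRequest()
    if let labels = try? request.supportedIdentifiers() {
        print("The on-device classifier knows \(labels.count) labels, for example:")
        for label in labels.prefix(10) {
            print(" -", label)
        }
    }
}
```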
The whole "file" thing and most ideas about how files are stored in computer systems today are anachronistic. But those of us who grew up with the desktop nested folder system often think so, and it's hard to think of a photo container as something other than a folder - but the folders have certain connotations of creation, access and administration this does not apply here.
Your photos are not placed in a container labeled "swimsuit." The search simply compares the text you typed into the box with the text in each photo's metadata, and where swimsuits were recognized, it lists those photos.
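In code, that search is closer to a filter over metadata than to opening a folder. Reusing the hypothetical `PhotoRecord` type from the earlier sketch, it might look something like this:

```swift
// Searching isn't "opening the swimsuit folder"; it's filtering the photo
// library for records whose metadata tags match the query text.
func search(_ library: [PhotoRecord], for query: String) -> [PhotoRecord] {
    let needle = query.lowercased()
    return library.filter { photo in
        photo.tags.contains { $0.lowercased() == needle }
    }
}
```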
That doesn't mean the companies involved are exempt from questioning. Which objects and categories do these services look for, what is excluded, and why? How were their classifiers trained, and are they, for example, equally effective on people with different skin colors or genders? How do you control or disable this feature, and if you can't, why not?
Fortunately, I've contacted several of the largest technology companies to ask them some of these questions, and I'll detail their answers in a future article.