Computer vision is moving into the mainstream. High-resolution, low-cost cameras are proliferating, found in products like smartphones and laptop computers. New software algorithms for mining, matching and scrutinizing the flood of visual data are progressing swiftly.
The technology can be used in hospitals, shopping malls, schools, subway platforms, offices and stadiums. Machines never blink.
All of which could be helpful - or alarming.
“Machines will definitely be able to observe us and understand us better,” said Hartmut Neven, a computer scientist and vision expert at Google. “Where that leads is uncertain.”
Google has been both at the forefront of the technology’s development and a source of the anxiety surrounding it. Its Street View service, which lets Internet users zoom in on a location, faced privacy complaints. Google will blur out people’s homes at their request.
With Google’s Goggles application, people can take a picture with a smartphone and search the Internet for matching images. The company’s executives excluded a facial-recognition feature, which they feared might be used to find personal information on people.
Scientists predict that people will increasingly be surrounded by machines that can not only see but also reason about what they are seeing.
The uses, noted Frances Scott, an expert in surveillance technologies, could allow the authorities to spot a terrorist, identify a lost child or locate an Alzheimer’s patient who has wandered off.
Millions of people now use products that show the progress that has been made in computer vision. The major online photo-sharing services have all started using face recognition.
Kinect, an add-on to Microsoft’s Xbox 360 gaming console, is a striking advance for computer vision in the marketplace. It uses a digital camera and sensors to recognize people and gestures; it also understands voice commands.
With Kinect, “technology more fundamentally understands you, so you don’t have to understand it,” said Alex Kipman, an engineer who helped design it.
‘Please Wash Your Hands’
Three months ago, Bassett Medical Center in Cooperstown, New York, began experimenting with computer vision. Small cameras on the ceiling monitor patients’ movements and track people going in and out of the room.
The first applications of the system, designed by General Electric, are reminders and alerts. Doctors and nurses are supposed to wash their hands before and after touching a patient; lapses contribute significantly to hospital-acquired infections, research shows. When someone forgets, a voice declares, “Pardon the interruption. Please wash your hands.”
The system can recognize movements that indicate when a patient is in danger of falling out of bed, and alert a nurse. More features can be added, like software that analyzes facial expressions for signs of severe pain or other distress, said Kunter Akbay, a G.E. scientist.
It is too early to say whether the computer vision system will be cost-effective.
Mirror, Mirror
Daniel J. McDuff, a graduate student, stood at a two-way mirror at the Massachusetts Institute of Technology’s Media Lab. After 20 seconds, a figure - 65, the number of times his heart beat per minute - appeared on the mirror. Behind the mirror, a Web camera fed images of Mr. McDuff to a computer whose software tracked the blood flow in his face.
The software separates the video images into three channels - for the basic colors red, green and blue. The changes in color, and the tiny movements made by contractions and expansions of blood vessels in the face, are not apparent to the human eye.
“Your heart-rate signal is in your face,” said Ming-zher Poh, an M.I.T. graduate student. Other vital signs, including breathing rate and blood pressure, should leave similar clues.
The pulse-measuring project, described in research published in May by Mr. Poh, Mr. McDuff and Rosalind W. Picard, a professor at the lab, is just the beginning, Mr. Poh said. Computer vision and clever software, he said, make it possible to monitor humans’ vital signs at a digital glance. Daily measurements can reveal that, for example, a person’s risk of heart trouble is rising. “In the future it will be in mirrors,” he said.
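The basic recipe can be sketched in a few lines of code. What follows is only a simplified illustration, not the published method (which applies independent component analysis to all three color channels): it averages the green channel over a cropped face region in each frame, then picks the strongest frequency in the plausible heart-rate band. The function name and the synthetic test data are assumptions made for the example.

import numpy as np

def estimate_pulse_bpm(face_frames, fps):
    # face_frames: array shaped (num_frames, height, width, 3), RGB,
    # cropped to the face region; fps: frames per second of the webcam.
    # Average the green channel over the face for each frame, giving a
    # one-dimensional signal that rises and falls with blood flow.
    signal = face_frames[:, :, :, 1].mean(axis=(1, 2))
    signal = signal - signal.mean()  # drop the constant skin-tone baseline
    # Find the dominant frequency between 0.75 Hz and 4 Hz (45 to 240
    # beats per minute), the physiologically plausible heart-rate range.
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fps)
    spectrum = np.abs(np.fft.rfft(signal))
    band = (freqs >= 0.75) & (freqs <= 4.0)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]

# Synthetic check: 30 seconds of 30-frames-per-second "video" whose green
# channel pulses 65 times a minute, like the figure in the mirror demo.
fps, bpm = 30, 65
t = np.arange(fps * 30) / fps
pulse = 0.5 * np.sin(2 * np.pi * (bpm / 60.0) * t)
frames = 100 + pulse[:, None, None, None] + 0.1 * np.random.randn(len(t), 8, 8, 3)
print(round(estimate_pulse_bpm(frames, fps)))  # roughly 65

A real system would also need to detect and track the face and cope with lighting changes and head motion, which is where much of the research effort goes.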
Faces can yield all sorts of information to computers. At M.I.T., Dr. Picard and a research scientist, Rana el-Kaliouby, have applied facial-expression analysis software to help people with autism better recognize emotional signals.
The two women founded Affectiva, a company in Waltham, Massachusetts, that is marketing its facial-expression analysis software to manufacturers of consumer products, retailers, marketers and movie studios.
John Ross, chief executive of Shopper Sciences, a marketing research company, said Affectiva’s technology promises to give marketers an impartial reading of the sequence of emotions that leads to a purchase. “You can see and analyze how people are reacting in real time, not what they are saying later, when they are often trying to be polite” in focus groups, he said. The software, Mr. Ross said, could be used in store kiosks or with Webcams. Shopper Sciences, he said, is testing Affectiva’s software with a major retailer and an online dating service.
Watching the Watchers
Maria Sonin, 33, an office worker in Waltham, Massachusetts, watched a movie trailer while Affectiva’s software calibrated her reaction. The software tracked movements on a couple of dozen points on her face. To the human eye, Ms. Sonin appeared to be amused. The software agreed, said Dr. el-Kaliouby, though it used a finer-grained analysis, like recording that her smiles were symmetrical (signaling amusement, not embarrassment).
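The article does not describe how Affectiva computes such details, so the following is only an invented illustration of one of them: judging whether a smile is symmetric by comparing how far each mouth corner has moved from its position in a neutral frame. The point indices and coordinates are made up for the example.

import numpy as np

def smile_symmetry(neutral_points, current_points, left=0, right=1):
    # neutral_points, current_points: arrays shaped (num_points, 2) holding
    # (x, y) image coordinates of tracked facial points in a neutral frame
    # and in the current frame; left/right are the mouth-corner indices.
    moved = np.linalg.norm(current_points - neutral_points, axis=1)
    a, b = moved[left], moved[right]
    # 1.0 means both corners moved the same distance (a symmetric smile);
    # values near 0.0 mean one side moved far more than the other.
    return 1.0 if max(a, b) == 0 else min(a, b) / max(a, b)

# Toy frames: both corners rise and spread by nearly the same amount.
neutral = np.array([[40.0, 80.0], [60.0, 80.0]])
smiling = np.array([[36.0, 74.0], [64.5, 74.5]])
print(round(smile_symmetry(neutral, smiling), 2))  # close to 1.0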
Christopher Hamilton, a technical director of visual effects, said facial-expression analysis technology “makes it possible to measure audience response with a scene-by-scene granularity that the current survey-and-questionnaire approach cannot.” A director could find, for example, that although viewers liked a movie over all, they did not like two or three scenes. Or he could learn that a particular character did not inspire the intended emotional response.
The challenge of computer vision arises from the rapid spread of less expensive yet powerful technologies.
At work or school, the technology opens the door to a computerized supervisor that is always watching.
More subtle could be the effect of a person knowing that he is being watched. It could be beneficial: a person reconsiders before committing a crime. But might it also lead to a society that is less spontaneous, less creative, less innovative?
As Hany Farid, a computer scientist at Dartmouth College in New Hampshire, said, “With every technology, there is a dark side.”
By STEVE LOHR