This week, a startup focused on AI training, Shift, announced a unique offer: it will clean homes in New York City at no cost. The company plans to expand this service to other major cities, including London, and it’s easy to see why such an offer would be appealing to many.
As one might expect, though, the offer comes with a catch.
In exchange for the complimentary cleaning, Shift wants to capture video footage of its cleaners in action. This includes tasks like washing dishes, wiping down counters, dusting surfaces, and mopping floors. Essentially, the company wants a comprehensive visual record of routine domestic chores, the kind many people would gladly delegate if possible. Robotics companies are keen to teach machines these tasks so they can sell robots that take over these duties for us.
Yet this is no easy feat. Unlike chatbots or AI image generators, which have advanced rapidly, robots must interact with the physical world. They have to comprehend spatial orientation, movement, force, and friction, and handle diverse shapes, materials, and lighting conditions, nuances that humans and other living creatures navigate almost instinctively. Tasks that are simple for humans, such as folding clothes or pouring a drink, remain complex challenges for robotics developers.
Teaching machines these skills requires a substantial amount of data. While text, images, and videos can be harvested from the internet on a large scale, often without crediting creators, data about the physical world is much harder to come by. It’s difficult to gather quality data discreetly and without cost, which makes data collection a major hurdle for companies advancing physical AI. Firms like Shift are therefore exploring inventive strategies to overcome this barrier.
Shift isn’t the only player in this game. In India, reports have surfaced about Pronto, a home services platform, using clients’ homes to gather AI training footage for tasks such as cooking, cleaning, and doing laundry. Pronto asserts that it records footage only with explicit customer consent, although it’s unclear what customers receive in exchange, aside from the footage itself. This practice has sparked controversy, prompting competitors to clarify that they do not record inside homes for AI training purposes and have no plans to start.
Other startups are focused on trying to scale data collection. Silicon Valley-based Human Archive, for example, hopes to partner with companies like Pronto and have gig workers record their activities using not-so-stylish camera caps. The hats collect footage from the wearer’s point of view, exactly the kind of “egocentric” or first-person data robotics companies need to teach machines how people navigate physical space. Shift, meanwhile, also taps consumers directly, and claims to have paid tens of thousands of people across 15 countries to record their activities through its app.
Some companies are skipping useful work altogether. Instead, they pay workers to complete the exact same physical tasks again and again while cameras and sensors capture every movement. Such staged data farms are designed to turn rote physical activity (folding towels, picking up cups, carrying boxes) into AI training material valuable enough to justify paying people to create it.
And some data is generated by robots already out in the world. Despite the hype, true automation is still a long way away — hence the need for all this data — but companies are keen to ship products anyway. They’ll use data from customers’ homes to improve the product. Many companies rely on remote workers to step in when the robots inevitably get stuck. They’ll use that data too.
Of course, the act of trading data for something of value is not new. Companies have been offering discounts, convenience, and free services in exchange for access to your data for years, from loyalty cards and cookies to dashcams, insurance apps monitoring how people drive, and that heinous smart TV that’s always showing ads.
What’s new is the kind of data companies are willing to pay for. For now, that may mean letting a human in a snazzy hat clean your home for free so that, eventually, a company can sell you a robot to do it instead.