The missing Hand in creating a truly robot-powered economy

Vivek Rajasekaran
Apr 9, 2022

A surprising effect of the exponential increase in computing power and advances in software technology is that automating non-trivial cognitive tasks is now much simpler than automating seemingly trivial physical tasks. Find out whether 2874901873 is prime? Simple. Show the best route from Electronic City to the Bangalore airport given the current traffic? Piece of cake. Show me recommended shirts that look similar to the one the guy sitting next to me is wearing? Done. Take a milk packet from the fridge and boil it? Sorry, not possible.

We have the tools and the methods to automate any problem that can be reduced purely to the software realm. But we run into challenges the moment a problem strays into the territory of doing some work in the physical world. A new framework and technology are needed for solving such problems generically. Before we get there though, there is one major gap that needs to be bridged. This post is about this gap and the bridge-building efforts that are underway.

Any intelligent process follows a cycle of sensing, processing (analysis and decision making) and acting, with complex processes having sub-cycles of the same nature.

Of these steps, sensing can be considered solved for many real-world settings. Sensors for sight and sound have been around for a long time and don’t cost much.

Processing (analysis and decision making) of the sensed data is also either solved or rapidly getting solved. Rule-based algorithms that understand data and take decisions rapidly have been around for a few decades now. More complex pattern matching, which makes sense of pictures and video streams, has also developed well in the last decade through advances in deep neural networks and the GPUs to train and run them. The algorithms are now good enough to match human-level “instinctive” decisions in scenarios where a trained person needs less than a couple of seconds to interpret and decide on an action. For longer-term continuous decisions in complex environments, the research work is still in progress. But many industrial and home problems can be broken down into a sequence of “instinctive” decisions as long as the environments are not very dynamic.

The last part of acting on the decision taken is where the problem gets into the physical domain. Accomplishing an action within machines (such as in factories) is simpler as it typically involves activating some mechanism through electrical signals. However, in many human cohabited environments, accomplishing an action involves working directly on objects or working through tools. For example, boiling milk in a kitchen involves multiple steps where physical manipulation is needed.
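The sense, process and act cycle can be sketched in a few lines of Python. This is only an illustration under assumptions that are not from the article: the `Pot` class is a crude simulation standing in for the physical world, and the names `sense`, `decide`, `act` and `run_cycle`, the 95 °C target and the 5 °C heating step are all hypothetical. Note that `act` here only flips a switch, which is the easy "within machines" case. Opening a milk packet and pouring it into the pot has no such one-line equivalent.

```python
from dataclasses import dataclass


@dataclass
class Pot:
    """Simulated pot of milk (hypothetical stand-in for the physical world)."""
    temperature_c: float = 25.0
    burner_on: bool = False

    def tick(self) -> None:
        # Crude assumed physics: the milk heats by 5 °C per step while the burner is on.
        if self.burner_on:
            self.temperature_c += 5.0


def sense(pot: Pot) -> float:
    """Sensing: read the current temperature."""
    return pot.temperature_c


def decide(temperature_c: float, target_c: float = 95.0) -> str:
    """Processing: a simple rule-based decision on the sensed value."""
    return "burner_off" if temperature_c >= target_c else "burner_on"


def act(pot: Pot, action: str) -> None:
    """Acting: here just an electrical switch, the easy case."""
    pot.burner_on = (action == "burner_on")


def run_cycle(pot: Pot, max_steps: int = 100) -> int:
    """Repeat sense -> decide -> act until the burner is switched off.

    Returns the number of cycles that were needed.
    """
    for step in range(1, max_steps + 1):
        reading = sense(pot)
        action = decide(reading)
        act(pot, action)
        if action == "burner_off":
            return step
        pot.tick()  # let the simulated world advance
    raise RuntimeError("target temperature not reached")


if __name__ == "__main__":
    pot = Pot()
    cycles = run_cycle(pot)
    print(f"stopped after {cycles} cycles at {pot.temperature_c} °C")
```

Everything in the loop stays inside software except `act`, and that is the step which stops being trivial once the action requires handling objects rather than sending a signal.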

Now, you might argue that this is a human way of boiling milk. A robot does not need to be designed to copy what humans do. There might be more efficient ways of automating the process. And you would be right. Vacuum cleaners don’t work the way humans clean a home. Dishwashers don’t replicate how humans wash dishes. In fact, pretty much all the industrial and home automation over the last few centuries has involved creating sophisticated machines that are much faster and more effective than human processes and use different principles. However, these developments come with a catch. They are all “specialized” machines. They do one thing and they do it really well. In contrast, human actions are versatile but not optimized. We can juggle balls, weave clothes, play the violin, bake cakes and perform surgeries. While there will continue to be more specialized machines that keep doing their one thing really well, I believe the next leap in productivity will come from machines that might not be the best at any one task but can do many things reasonably well. Essentially, a general purpose robot that can have an impact similar to that of the general purpose computer, which can support many applications. This would make economic sense in environments where there are multiple varieties of jobs to be done (such as in a home) or where each job needs to handle multiple variations of objects (such as picking and packing groceries against an order).

Physically, there are two major things that a general purpose robot needs to be able to do: move seamlessly and manipulate objects flexibly. Robots can now move seamlessly (for example, see this robotic dog and this flipping robot). However, they still can’t flexibly manipulate objects. The missing link is the Hand. We don’t yet have an automated hand that can rival our natural endowment.

It is no coincidence that marketers use the term “hand-made” to contrast products created by humans with those made by machines and to evoke an emotional connection with consumers. Hands are complex. They enable both powerful and intricate actions. We can grasp, hold, pull, push, tap, clap, punch, pinch, slap, turn and do a myriad of other things. Amongst organs, they punch above their weight.

These weird-looking models are homunculi. They show each body part in proportion to the amount of brain area dedicated to sensing (first / left pic) and acting (second / right pic) for that body part. They give an idea of the immense work that goes into providing the sensitivity and nuanced flexibility of the palm and fingers. Image source: FineArtAmerica (1, 2)

So where exactly are we in the quest to build a robot hand that can do all the awesome things that our hands can do? Things are not so impressive just yet, but smart folks are getting into the field, and the area has moved from primarily academic research to startups and large tech companies actively working on the problem. The e-commerce boom is one reason driving companies to automate their supply chains and reduce operating costs. While the production of manufactured items in factories is largely automated, the warehouse operations needed to pick and pack items are human-driven. Fulfillment and distribution centers are filled with jobs where people just need to pick the right things, put them in appropriate packages and send them on their way. This is the first area where we can expect to see robots with versatile hands.

Dexterous hand (from OpenAI’s blog)

Shadow Robot’s Dexterous Hand has flexibility similar to that of a human hand (measured in degrees of freedom). It is currently targeted at tele-operation use cases (human operation at a distance in hazardous environments), but it can be expected to be fitted on robots. OpenAI has used reinforcement learning on this hand to teach different grasping techniques. The hand can be fitted with tactile sensors such as those provided by Syntouch for sensing force, vibration and temperature. This is currently used in prosthetics research.

Stretch in action (source: youtube)

Suction-based robotic arms powered by computer vision are already starting to get deployed. These are sufficient for relatively simple tasks like picking and placing. Stretch, from Boston Dynamics, can already automate loading and unloading jobs. Skypicker, from Exotec, can pick a wide range of packaged grocery items.

When this becomes mainstream, we can expect a transformation of all our living spaces. It will trigger a rapid automation of most blue collar work as well as of the physical tasks involved in gray collar work. We can imagine the bots being built by some companies while the skills for performing different tasks are developed by a plethora of companies competing with each other in a new version of the app store. Farms, warehouses, homes and maybe even hospitals would start getting powered by the new hand.
