How teaching robots the way the world works changes the world of work
Harvard Business School

Robots aren’t necessarily primed to take over, but advances in machine learning are readying the mechanical components of the workforce for more complex and autonomous tasks. Startup Osaro specializes in deep reinforcement learning systems: artificial intelligence for industrial robots. CEO Derik Pridmore talks about the adaptive decision-making capabilities working their way into warehouses and factories, and the prospect of machines with a wider, more human range of cognitive capabilities.

Bill Kerr: Images of the future commonly include powerful robots that can serve our every need. Robots today are a long way from those visions, however. They generally perform highly repetitive tasks, often involving little variation in very structured environments. The vast majority, as this episode’s guest would say, are blind and dumb. Recent advances in artificial intelligence, however, are changing this and expanding the boundaries of what robots can do.

Welcome to the Managing the Future of Work podcast from Harvard Business School. I’m your host, Bill Kerr. Today I’m speaking with Derik Pridmore, cofounder and CEO of Osaro, a company that makes brains for robots. Derik will tell us about how machine-learning techniques can revolutionize robotics and what his company is doing to turn this potential into reality. Welcome, Derik.

Derik Pridmore: Thanks. Thanks for having me.

Kerr: Derik, you left a top-notch job in finance because you’re excited about this potential area of technology. Tell us about what dazzled you.

Pridmore: I guess I’d start by saying I consider myself a technologist. Originally, I studied physics down the road at MIT and was always interested in that. I’m a sci-fi geek. I really went into finance as a function of happenstance and was excited to move back in that direction. I worked at a venture fund, invested in technology startups, and I see this as the future.

Kerr: One of those technology startups was DeepMind.

Pridmore: Um-hm.

Kerr: That was a very transformative venture. Tell us a little bit just about DeepMind and what you saw through that investment.

Pridmore: To be honest, when we invested, it was just three guys and an idea. They painted a compelling picture of where we could take AI, and it made sense to me. We met the company in 2010. It’s a London-based AI startup founded by Demis Hassabis, Shane Legg, and Mustafa Suleyman. The idea was to take inspiration from the brain—not just deep learning or reinforcement learning, but anything they could glean, essentially ideas from neuroscience—and try to solve intelligence. That was not a good business pitch, but they had already mapped out a pretty interesting roadmap around proving functionality that we know as a species exists in the brain, and using games to do that.

Kerr: And this ultimately had a very successful acquisition by Google. On the other side, what do you think Google was seeing in this technology, or how would Google want to absorb it into the company?

Pridmore: Google looked at some of the demos DeepMind had put together—which were certainly state-of-the-art for the time—and saw that they would have broad-based applicability, whether in image search or ultimately some of the work they did around data-center optimization. Google has a ton of very hard problems that need data to solve, and that’s what machine learning is good at. So I think that pragmatic, functionality-focused approach made the acquisition obvious. But then, it turns out the founders of Google considered Google from day one to have been an AI company, and when they met Demis, the quote was, “These guys are hell-bent on building AI. We have to get them.”

Kerr: Okay. It was a match and one of deep chemistry there.

Pridmore: Yeah.

Kerr: Deep reinforcement learning. Could you just say a little bit more about what part of the human brain or activity is this mimicking, and how does it help the computer or the robot learn new stuff?

Pridmore: I’ll caveat that I originally trained as a physicist and then got a lot of brain damage from doing finance, but I work with some very smart machine-learning engineers, so I hope they’ll be okay with my explanation. Deep learning is loosely inspired by the way the brain works. The brain is a bunch of connections of neurons, the firing of one affects the others, and the more a neuron fires, the more its propensity to fire again changes. That’s more or less how deep learning is designed to work. You take a very high-dimensional input, like a picture, and you essentially compress it down to a low-dimensional output, like a one or a zero. You have to do that in kind of a fuzzy way. And you do it by showing this network of connections picture after picture and then updating the weights of that graph so it can better predict the outputs that you want.
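To make the mechanics concrete, here is a minimal sketch of that training loop (a toy example with made-up data, not Osaro’s code): a tiny two-layer network compresses a high-dimensional input, a fake 64-pixel image, down to a single number, and its weights are updated, picture after picture, to better predict the desired output.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 64-pixel "images"; the label is 1 if mean brightness is high.
X = rng.random((200, 64))
y = (X.mean(axis=1) > 0.5).astype(float)

W1 = rng.normal(0, 0.1, (64, 16))   # input -> hidden connections
W2 = rng.normal(0, 0.1, (16, 1))    # hidden -> output connections

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(500):
    h = np.tanh(X @ W1)              # hidden activations ("neurons firing")
    p = sigmoid(h @ W2).ravel()      # compressed output: one number per image
    err = p - y                      # how wrong each prediction is
    # Backpropagation: nudge each weight in the direction that reduces error.
    dW2 = h.T @ err[:, None] / len(X)
    dh = err[:, None] @ W2.T * (1 - h**2)
    dW1 = X.T @ dh / len(X)
    W2 -= lr * dW2
    W1 -= lr * dW1

print("training accuracy:", ((p > 0.5) == y).mean())
```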

Another way of thinking about it is as function approximation. Deep learning generally falls under the heading of supervised learning. Reinforcement learning, by contrast, is about sequential decision-making—for instance, how you play a game: what sequence of actions do you take in order to win? The idea is, you might need to take many thousands of actions before something happens and you get a reward, and you need to decide how all of those past actions contributed to the win. If you can do this enough times with enough data, there are algorithms that will, in certain scenarios, learn how to control a game. If you put these two things together, you can use deep learning to compress the perceptual input and hand it off to a reinforcement learning–based controller that takes those representations as inputs and plans the sequential actions it needs to achieve a goal. The interesting thing about deep reinforcement learning is that you don’t do these things separately. You actually have to train them at the same time. That was the breakthrough—that you could learn both your perceptual understanding and your control policy at the very same time.
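A hedged sketch of that idea (REINFORCE on an invented toy game, not Osaro’s system) shows how a shared perceptual trunk and a control head are trained jointly from reward alone:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy "game": a 16-pixel observation encodes which action (0 or 1) pays off.
def step(obs, action):
    correct = int(obs[:8].sum() > obs[8:].sum())
    return 1.0 if action == correct else 0.0   # reward

policy = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(),   # perception: compress pixels to features
    nn.Linear(32, 2),               # control: score each possible action
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(2000):
    obs = torch.rand(16)
    dist = torch.distributions.Categorical(logits=policy(obs))
    action = dist.sample()
    reward = step(obs, action.item())
    # REINFORCE: raise the log-probability of actions in proportion to the
    # reward they earned. The gradient flows through BOTH the control head
    # and the perceptual trunk, so they are learned at the same time.
    loss = -dist.log_prob(action) * reward
    opt.zero_grad()
    loss.backward()
    opt.step()
```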

Kerr: Which is, in reality, how humans learn—learning to walk, learning to observe the world. One- to two-year-olds, as they make their way around, are trying to learn what the various objects out there are, while also trying and failing at their early steps toward walking.

Pridmore: You’re bumping into things, you’re falling down. Exactly.

Kerr: You mentioned Osaro’s early days. You were searching around for the right product space. Tell us about the things that you considered and then how you ultimately came to focus on robotics.

Pridmore: Deep reinforcement learning has the potential to enable products in a wide variety of areas, so we looked at a couple of different things. I would say, in retrospect, we started the company the way you shouldn’t. You should start with the product. You should probably have customers. We started with a technology that we knew could be applied to a lot of different areas. Specifically, our idea was to take some of the early techniques—essentially the demos of deep reinforcement learning agents playing games—and apply them to games themselves. The idea was to use them to debug games. Rather than have an Atari agent play Atari and strictly try to run up the score, you could augment its reward function and essentially give it points for crashing the game. If you could build a system that did that, you could replace a lot of the manual gameplay that happens while games are being tested, which actually slows the process down. Right now, when people build games, they’ll go to countries like India or sometimes Canada and have people play them. So that was one: game debugging.
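The reward-augmentation idea is simple to express in code. Here is a hypothetical sketch (none of this is Osaro’s actual product) that wraps any game exposing a step(action) method so an agent gets paid for provoking crashes rather than for score:

```python
class CrashHuntingWrapper:
    """Rewards an RL agent for crashing the game under test."""

    def __init__(self, game, crash_bonus=100.0, score_weight=0.0):
        self.game = game                  # any env with step(action) -> (obs, score, done)
        self.crash_bonus = crash_bonus
        self.score_weight = score_weight  # optionally keep a little score signal

    def step(self, action):
        try:
            obs, score, done = self.game.step(action)
        except Exception:                 # the game crashed: jackpot
            return None, self.crash_bonus, True
        return obs, self.score_weight * score, done
```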

Another was ad optimization. The idea there was that ad optimization is in some sense a reinforcement-learning problem, meaning it’s a delayed-reward problem: you do something up front, and you expect something to happen many steps later. But it’s also a sequential decision-making problem, meaning the true state of a person’s brain is affected by everything that happens to them. Even something as simple as changing the order in which I show you ads or posts—on Facebook, for instance—affects how you react to them. Social psychologists have known about this effect for a long time. So it makes sense to try to take those things into account, and we looked at doing that sort of thing.
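The delayed-reward part of that framing can be illustrated in a few lines of Python (the sequence and discount factor are invented): a conversion at the end of a session is credited back to each earlier ad exposure.

```python
def discounted_credits(rewards, gamma=0.9):
    """Propagate a late reward back to the earlier actions that preceded it."""
    credits, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        credits.append(running)
    return list(reversed(credits))

# Five ad impressions; only the last step yields a conversion (reward 1).
print(discounted_credits([0, 0, 0, 0, 1]))
# -> [0.6561, 0.729, 0.81, 0.9, 1.0]: earlier exposures get partial credit.
```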

And then robotics was the other big one. Robots are essentially sold today without a brain, and we can talk more about that. To usefully apply them to new domains, you need to be able to do closed-loop control, meaning take perceptual inputs from a camera and then control the robot. That’s the kind of thing that deep learning is very good at. There’s also this control aspect, which is sort of similar to reinforcement learning.

Kerr: You then take this—in the robotic space you start with warehouse piece-picking. Can you tell us a little bit about that space, what attracted you to it, and what are your products doing in there right now?

Pridmore: Once we had determined that robotics was a very interesting space—because there was high demand, there weren’t a lot of players, and there were no big companies with a proprietary advantage around data, for instance—we started looking for the easiest use cases we could find, to be honest. We wanted to sell a product, and we wanted it to be something that works and that doesn’t put a huge technical burden on us. We’re definitely doing deep tech, and we have a really smart team who do things like deep reinforcement learning. But having said that, we wanted to make the problem as simple as possible. So we went out and started looking at all the use cases. You can imagine lots of things—from very precise manufacturing of cellphone components to warehouse piece-picking. The reason we settled on piece-picking was, simply, that it’s easier: you need to pick an object up from a bin, move it to another bin, and pack it. This, by the way, is the infrastructure that backs up e-commerce. When you order a couple of things on a website, they’re sitting in different parts of a warehouse somewhere; someone needs to go get them, put them in one box, and ship it to you. A lot of that process is still manual today, but it’s rapidly becoming automated, both to make the entire process efficient and because a lot of markets have labor shortages. Japan, for instance, is a key market for us; it has a population decline of around 1 percent a year, which is 5,000 workers a day.

Kerr: There’s a relentless pressure also to push prices down. The more we can automate or lower prices through less labor input, that makes all of our packages from Amazon a bit cheaper.

Pridmore: Yep, that’s right. The more we can shift from a mode where we’re paying people every day to run around to one where we buy one machine that sorts things and puts them together, then that reduces prices.

Kerr: You said this was a little easier than making that cellphone component, but it also sounds like it’s been a very challenging space. Amazon has held picking contests and other events through the years to help with this particular final stage of warehouse automation. What is it that connects with deep reinforcement learning here? What can Osaro do that the other products could not?

Pridmore: To some extent—and it’s not just Osaro; there are a lot of companies working in this space—it’s a function of the technology progressing to the point where it’s possible. Sometime around 2017, which is roughly when Amazon decided to stop holding the Amazon Picking Challenge, we saw that the deep-learning models we were using for perception had become accurate enough and had enough scale, meaning they could label and identify a wide enough variety of products to be commercially useful. At that point, it went from being a very hard theoretical machine-learning problem to a pretty hard engineering problem, but a tractable one.

Kerr: To continue on Amazon: they purchased Kiva seven years ago, which helps move products closer to the human workers who pick them and put them in packages. Yet by some estimates, less than one out of five of Amazon’s warehouses is deeply automated. What is slowing that adoption process down?

Pridmore: Many things. I think Amazon has a particular problem. Taking a few steps back, there are lots of different kinds of material-handling facilities and lots of different kinds of shipping. You might be shipping just boxes, or you might be fulfilling very specific products, like pills or cosmetics. That range of products has varying levels of difficulty. Amazon has the worst version of it in that they do everything. So it’s not surprising that they’ve only automated a small section of their problem.

The other issue Amazon has is that, again, because they’re doing so much at such a large scale, they have a seasonality problem, where a lot of their demand comes basically during the holiday season. They have to spin up new facilities, and that’s actually what Kiva was designed for. That system is what you might call “reverse compatible,” meaning you can go into a big, open warehouse and very quickly turn it into a facility that moves things around.

Now, having said that, it’s not necessarily the most efficient warehouse. The Kiva units specifically are very small—maybe six-inch-high—circular robots that roll around; they’ll roll under a shelf, pick it up, and move the entire shelf to a person. If you just think as a physicist trying to design an efficient system, that’s not exactly the one you would design, because if I need to get a little package of dental floss off a shelf, the most energetically efficient way to do that is not to bring an entire shelf of products to me and then grab the floss. But it is efficient in the sense that it allows you to spin up a new facility as rapidly as possible. One of the things we spend time thinking about is: if you could redesign the entire matter-routing infrastructure of the world to be energetically efficient, how would you do it? That’s one of the reasons why, when we started to sell our product, we focused on rack storage systems, where the products are pretty much the only thing that moves; they’re stored in very small bins that hold just the product. The magic would be, if I want that floss, it just flies off the shelf and comes to me—the only thing that moves is the floss, not the entire shelf. Rack storage systems are pretty much as close as you can get to that. But again, Amazon has a very wide range of problems, and the technology has only just progressed to the point where you can start to automate not just the movement of big things like the racks but also the individual piece-picking. So they’re just getting started.
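That physicist’s intuition is easy to check with back-of-the-envelope arithmetic. All of the numbers below are assumptions chosen for illustration, not measurements of Kiva or any real system:

```python
shelf_kg, floss_kg, distance_m = 250.0, 0.03, 20.0   # assumed masses and trip length
friction_coeff, g = 0.05, 9.81                        # rolling resistance, gravity

work_shelf = friction_coeff * shelf_kg * g * distance_m   # haul the whole shelf
work_item = friction_coeff * floss_kg * g * distance_m    # move just the floss

print(f"shelf: {work_shelf:.0f} J, item alone: {work_item:.2f} J, "
      f"ratio ~{work_shelf / work_item:,.0f}x")
# Moving the shelf costs thousands of times more energy than moving the item.
```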

Kerr: An important part of your work is that in the warehouse setting there would be many, many types of products—10,000, 20,000, 30,000 types. And also the precision can be okay—you don’t have to be as precise as you need to be in some manufacturing applications. You can do well with a little bit less. Talk to us about some of the technical challenges of needing to pick up 10,000 or 20,000 products. How do you handle that level of complexity, that variation?

Pridmore: It’s not just the variation in the products but also the packaging types. One of the things you learn really quickly as you focus in on your customer base and start looking at this piece-picking problem is that the products—if you’re looking at it from just a machine-learning perspective—are almost adversarially packaged. They’re clear things, wrapped in clear things, wrapped in foil.

Kerr: They’re just making it hard on you.

Pridmore: Yeah, exactly. Just making it as hard as they can. So that was one of the challenges when we started the company. We’ve had to make decisions about whether we want to do something in software or hardware, and our bet has always been that we could do it in software, because the machine-learning algorithms will be powerful enough. An example: when you’re looking at clear items, you could try to build a special sensor for that, or you could say, “Look, let’s do it the way people do. Let’s just use color cameras, and maybe use two of them.” That’s one of the challenges we had to face, and we’ve addressed it with a lot of deep learning; that’s a primary technique we use on the vision side. The range of objects, which you mentioned, is also challenging. Being able to build models that can identify huge numbers of objects is essentially the bread and butter of deep learning in some sense, but at the same time, most of the techniques that are published academically are built on datasets that have hundreds or thousands of objects—not tens of thousands or millions. That’s been an area of ongoing research. We do our own research internally at Osaro, and we also solve some of the challenges with engineering.
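One way to see why the jump from academic benchmarks to warehouse catalogs is hard: a classifier’s final layer grows linearly with the number of object classes. A rough illustration, assuming a typical 2,048-wide feature vector:

```python
features = 2048   # assumed width of the final feature layer in a vision model

for num_classes in (1_000, 100_000, 1_000_000):
    head_params = features * num_classes
    print(f"{num_classes:>9,} classes -> {head_params / 1e6:7.1f}M parameters "
          f"in the classification head alone")
```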

Kerr: What do you think about applying your software, your techniques, at the device level or in the cloud? Or do you mix between those?

Pridmore: Most of these decisions are driven by product requirements, meaning you certainly have to be able to control the robot and identify the range of products that’s necessary. Part of that means you can’t have much latency, so you need to be right next to the robot, communicating over a LAN [local area network] rather than an Internet connection. Even hundreds of milliseconds can translate into centimeters of robot movement, so you can’t be in the cloud coming down to the edge. But having said that, the models we deploy there are actually quite computationally intensive to create. It can take days of training across a cluster of GPUs [graphics processing units] to get one of these deep-learning models to the point where it’s accurate enough to be deployed. So we actually have a hybrid system: we have cloud infrastructure where we can send data out, combine it with our existing data, and train these models, sometimes over days, but then we deploy them at the edge, where they run in what’s called “inference mode” on the machine. That’s one of the distinctions in deep learning that I think may be lost on some people: there’s a difference between training and inference. Training is very computationally intensive; it can take a long time, and it’s very energetically intensive. But once it’s trained, these models are actually very fast, and they’re pretty lightweight. You can take something that might need days across five machines in the cloud to train, but it winds up being a file that’s only a few megabytes, and it sits on the GPU and runs very quickly.
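A minimal sketch of that training-versus-inference split (the model, file name, and sizes are placeholders, not Osaro’s pipeline):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))

# --- In the cloud: training is the expensive part (days on a GPU cluster). ---
# ... a training loop over accumulated warehouse data would go here ...
torch.save(model.state_dict(), "picker.pt")   # the result is just a small file

# --- At the edge, next to the robot: load once, then run in inference mode. ---
edge_model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10))
edge_model.load_state_dict(torch.load("picker.pt"))
edge_model.eval()                     # inference mode: no training behavior
with torch.no_grad():                 # no gradients: fast and lightweight
    prediction = edge_model(torch.rand(1, 512))

# Why latency forces inference to run locally: an arm moving at 0.5 m/s
# travels 10 cm during a 200 ms round trip to a distant server.
print(0.5 * 0.2, "meters of drift per 200 ms of latency")
```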

Kerr: As you think about both this particular application and the comment you made earlier, there’s the need to have top-notch engineers. This is a space where the Googles of the world have regularly gobbled up as many AI researchers as they possibly can. How have you pulled off building up a team at Osaro?

Pridmore: By looking all over, finding the smartest people we can, and keeping the bar high. Once you reach a certain critical mass of really smart people, they only want to work with other smart people, and that helps you with recruiting. One of the challenging things has been that some of these bigger companies, like Facebook and Google, essentially pay people not to work at times. They are paying people to publish papers, and that’s fine. When we interview people who are looking to do more of a research-type job, we just tell them that’s not what we do here. We build products. I think the people we do get realize that there are extremely hard challenges in getting products to work. In some ways, it’s a metric that you can’t spin your way around. What we find—and I think even some of the researchers in the field are frustrated by this—is that there’s this explosion of papers, but in some sense the quality is starting to go down, and it’s hard to know what actually works. When you’re building a product, it’s really simple. It’s, like, if it doesn’t work …

Kerr: You know what works.

Pridmore: Exactly. I think when you get people who are in some sense skeptical of what’s out there but they know that they have what it takes, and you get a critical mass of those people together, then it sort of starts to feed on itself.

Kerr: Yeah. That’s not just this field. A number of people in the pharmaceutical industry have commented that many research papers don’t have findings they can reproduce later on, and there’s this friction between frontier science work and actually building a product. So, as you think about moving beyond warehouse piece-picking, what do you see as the future of Osaro? How are you going to define the next industry or spaces to go into?

Pridmore: Wow! That’s a big question. I think we spend a lot of our time focusing on the next six, 12, 18 months. Long term, again, one of the reasons we knew we wanted to do this as a software company is that the hardware is changing so rapidly, and part of that is enabled by new algorithms. If you look at the robots we’re building today, they’re essentially repurposed from the automobile industry, where you had to build an extremely stiff robot so that you could know: if I put exactly this many amps of current into this motor, the arm will move by exactly this many millimeters. If you think about how you control your hand, it doesn’t work that way. You look at a bottle, you don’t know how far away it is; you just start reaching for it, you do successive approximation, and you eventually grab it. Once we have machine-learning models powerful enough to do this very quickly and learn to control any robot, the kinds of robots we can build will change a lot. We’ll be able to build much cheaper robots, probably special-purpose robots. Some listeners might immediately imagine humanoid robots, but that’s not necessarily the way you would build a robot if you could build any robot. It’s the washing machine versus dishwashing-humanoid analogy: a machine purpose-built for a task usually beats a human-shaped one doing it our way.

As you think long term about the kinds of tasks you could do or products you could build, I think one of the last very interesting proprietary sources of knowledge in the world is basically in people’s heads. The information for how you even make an electronic device isn’t necessarily, strictly speaking, in the schematics for it. If you look at what Foxconn workers do, for instance, oftentimes they’re using components that aren’t quite to spec. They’re kind of force-fitting things in there. That’s Polanyi’s Paradox: these things that we do aren’t written down, so a lot of that knowledge is in people’s heads. If you could design a software platform that can control robots and work across these domains—starting with piece-picking, then maybe moving into simple manufacturing and eventually into much more precise manufacturing—then you can become a manufacturing company. I think there’s a wide range of possibilities. We’re probably just going to follow market dynamics and listen to our customers in the beginning.
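That “successive approximation” style of control has a simple closed-loop form. A hedged sketch, with an invented target, gain, and tolerance: instead of computing the exact motion in advance, the controller repeatedly looks, estimates the remaining error, and closes a fraction of it.

```python
import numpy as np

target = np.array([0.40, 0.25, 0.10])   # where the bottle appears to be (m)
hand = np.array([0.0, 0.0, 0.0])        # current gripper position (m)
gain, tolerance = 0.3, 0.005            # close 30% of the error each step

steps = 0
while np.linalg.norm(target - hand) > tolerance:
    error = target - hand               # "look": re-estimate the gap
    hand = hand + gain * error          # "reach": close part of the gap
    steps += 1

print(f"grasped after {steps} corrective steps")
```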

Kerr: This is the ultimate. You mentioned Polanyi’s Paradox—moving from the stuff you can codify into this tacit knowledge base, and the ability of technology to start moving into that area. You’ve been both an investor in this space and an entrepreneur. Do you anticipate a race to hit all these different places where we have tacit knowledge?

Pridmore: I don’t know. I go back and forth on this issue. I think one of the things we can expect is that there will be another deep learning, essentially. It won’t be deep learning; it’ll be something else. And that will change, for instance, the value of some of the data that exists in these domains. You could see, for instance, a company that accumulates a ton of data in the piece-picking field as having a competitive advantage—and then one day we figure out what it is that monkeys are born with that lets them immediately start climbing trees, and all of a sudden piece-picking becomes easy and something else becomes difficult. It’s really hard to predict, just because there’s so much left to learn in machine learning. One way I try to approach this problem is I just look around at what people are doing. People are supercomputers—every person is basically a supercomputer. Any time you see a person doing something that, for instance, a seven-year-old could do, that’s probably a good sign that that thing will be automated and that that person will be freed up to do something else. The thing I can be sure of is that you’ll definitely see a race to automate those kinds of tasks, and people will be freed up to do hopefully more interesting and more complex tasks.

Kerr: Sort of creates the new work. I guess this leads into our last macro question. People look out to the future, and they’re worried that there won’t be any jobs—that once we have AI-powered robots that can do everything, there won’t be any need for us humans to be involved anymore. Do you see that as the future, or are you more optimistic that there will always be something else that sits beside it?

Pridmore: I’m optimistic. I think we have a lot of choices to make. We can decide what kind of future we build. We can build a really terrible future. We can build a fantastic future. It’s up to us. I’m an optimist, though. I think we will continue—and already have continued—to invent extremely interesting things, like podcasts and Instagram, things we never could have even considered to be work until we invented them. At the same time, the universe is huge. There are so many things. Let’s get to Mars. Let’s get to the moon. Let’s do those kinds of things. I think there’s plenty of work to be done.

Kerr: Yeah, so many opportunities there. All right. Now, let’s bring it down to a more tactical level. For an entrepreneur who is coming into the machine-learning space more broadly, do you have a single biggest piece of advice?

Pridmore: I would say stay focused on your customer and the problem, and don’t be wedded to any particular technique. It’s nice to come in with some priors—whether that’s technical expertise or a hypothesis about which technique it will take to make things work—but you should remain flexible, because you’re going to learn a lot about your problem as soon as you start working on it. You’re going to want to build the simplest system that gets the job done. So don’t over-engineer.

Kerr: What about for an MBA student who’s just graduating?

Pridmore: I would say, probably, don’t start the company right away unless it’s a field you’ve already been working in. You need to come to the problem with something—either knowledge of the problem itself, from working in the industry, or knowledge of what you think the solution is. Getting exposure to the kinds of problems you’re going to face is a good first step.

Kerr: We appreciate Derik Pridmore, CEO and cofounder of Osaro, for joining us today to talk about artificial intelligence and its intersection with robotics. Thank you, Derik.

Pridmore: Thanks for having me in.

Kerr: And thanks to all of you for listening in.
