Design Mind frogcast Ep.60 – Co-evolving with Physical AI

Our Guest: John Robins, Head of Physical AI, frog
Podcast

On this episode of the Design Mind frogcast, we’re exploring what happens when AI moves from the screen into the physical world. To take this leap, we’re joined by John Robins, Head of Physical AI at frog.

John defines physical AI as intelligence that understands the physics of our world. Together, we journey through how real-world experiences could be set to evolve as AI gains the ability to move through space, share our environments and take actions alongside us.

Are we about to see physical AI at scale? And if so, where will we see early adoption? How can we navigate decisions around what to and what not to automate? How can we ensure we don’t lose touch with human aspects, such as emotion and ritual?

Listen to the podcast episode and watch the full video below. You can also find the Design Mind frogcast on Apple Podcasts, Spotify and anywhere you listen to podcasts.

Are you ready to uncover the future of human experiences? Download frog’s latest Futurescape report ‘Artificial Realities.’

Discover recent frog physical AI work: Designing Robots for Human Spaces. And check out Fast Company articles on the project here and here.

Episode Transcript:

Design Mind frogcast

Episode 60: Co-evolving with Physical AI

Guest: John Robins, Head of Physical AI, frog

[00:00:00] Elizabeth Wood: Welcome to the Design Mind frogcast. Each episode, we go behind the scenes to meet the people designing what’s next in the world of products, services and experiences, both here at frog and far, far outside the pond. I’m Elizabeth Wood.

Today on our show, we’re exploring what it really means when AI moves off the screen and into the physical world. To do this, we’re joined by John Robins, Head of Physical AI at frog.

For John, physical AI isn’t just about systems, sensors or autonomy. It starts with how we experience the world—what we notice, what we’re drawn to and how we share meaningful moments. It’s also about the changes that happen when AI can move through space, share our environments and take action alongside us.

Here’s John now.

[00:00:50] John Robins: At a personal level, I’m actually enamored by beauty. I live in Marin. My apartment faces Mount Tam and Corte Madera Creek. And so I’m just like drawn by beauty all around us. I enjoy exploring new countries, you know, just experiencing new cultures and new cuisines. I’m based in San Francisco. I see beauty when I go to a ballet or a musical. And as I was thinking about it, I realized that every time I’m experiencing a beautiful moment, my first instinct is to pull out my phone and capture that moment. I want it to last as long as possible. So this whole idea of creating products and experiences that bring delight and bring joy and create memories for people, of course, leveraging technology is more than a job for me.

Hi, I’m John Robins. I’m heading Physical AI for frog. I have an engineering and business background. I was involved in the creation and launch of one of the world’s first 3D map products, which is basically a digital representation of the world and, in one way, a representation of physical AI.

I remember very fondly, at the time, my company president handed me a copy of Don Norman’s Design of Everyday Things, and he was very particular about identifying user needs. And one of the formative experiences I had was taking a paper map (these were the days when we still used paper maps) and sitting down with end users and trying to understand what they wanted to see in a digital map. So early on, my thinking was shaped around human-centered thinking and thinking about not just how technology works, but also how it is used by human beings, and how it makes our lives more meaningful.

Fast forward, many years later, a few years ago, I took a course in AI because of my background in navigation and the advances in autonomous vehicles. That is when I first started getting fascinated by the idea of AI for the physical world. Following that, I have worked in connected technologies, sensor technology and IoT as well for industrial and consumer applications. So throughout my life, I have worked in technologies at the intersection of the physical and digital, and I find that very fascinating, because, as you know, humans are analog, but we seamlessly navigate between the physical and digital world through technology, and so to be able to enable that is a fascinating part of my work.

[00:03:08] Elizabeth Wood: Physical AI is showing up in more conversations. But as an emerging field, it’s often approached with assumptions and, sometimes, misunderstandings. Here John shares what physical AI actually is and why the correct definition matters.

[00:03:21] John Robins: When I think of physical AI, I think of intelligence that understands the physics of the world. It understands force, causality, consequences. We are all familiar with large language models. You know, LLM-based models process text or information. If I type ‘pour me a glass of red wine,’ it’s basically a string of words, right? But physical AI understands the weight of the glass, its contents, the friction that is needed to move it safely, so it translates my text into action. Basically, physical AI sees the world through sensors, understands context, reasons about what is happening and then translates that into action.
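As a rough illustration of the step from text to action that John describes, here is a minimal Python sketch of the ‘pour me a glass of red wine’ example. Everything in it is an assumption made for illustration: the `Observation` fields, the two-finger gripper model, the safety factor and the `plan` function are hypothetical, not part of any system mentioned in the episode. It only shows how weight and friction could feed into a grip-force estimate before an action is issued.

```python
from dataclasses import dataclass

G = 9.81  # gravitational acceleration in m/s^2


@dataclass
class Observation:
    """Hypothetical sensor estimate of the object to be handled."""
    mass_kg: float          # glass plus contents
    friction_coeff: float   # between gripper pads and glass


def min_grip_force(obs: Observation, safety_factor: float = 1.5) -> float:
    """Normal force (N) per finger needed to hold the object against gravity.

    Assumes a two-finger gripper: each contact supplies friction mu * N,
    so holding requires 2 * mu * N >= m * g.
    """
    if obs.mass_kg <= 0 or obs.friction_coeff <= 0:
        raise ValueError("mass and friction coefficient must be positive")
    return safety_factor * obs.mass_kg * G / (2 * obs.friction_coeff)


def plan(command: str, obs: Observation) -> dict:
    """Turn a text command plus a sensor estimate into an action description."""
    if "pour" not in command.lower():
        return {"action": "none"}
    return {"action": "grasp_and_pour", "grip_force_n": min_grip_force(obs)}


if __name__ == "__main__":
    print(plan("Pour me a glass of red wine", Observation(0.4, 0.5)))
```

The text command alone never determines the grip force; the sensed mass and friction do, which is the distinction between a string of words and a physically grounded action.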

Physical AI is very multidisciplinary. It is one of those few fields where you need software engineers, hardware engineers, designers, strategists, and it spans the entire innovation funnel, all the way from opportunity definition and ROI framing, through product definition and early concepting, to downselecting those concepts and prototyping them to validate whether it works and is the kind of product that we want to build, and then finally building it and scaling it for impact. So yeah, our teams are involved in every stage, and in my role, I’m really fortunate to be able to get, like, a front row seat talking to clients who are ambitious, who want to impact the lives of their consumers and users as well as create new business value. So understanding what their aspirations and ambitions are, and translating that into technical requirements and design requirements and bringing it to reality, is part of my role.

[00:04:54] Elizabeth Wood: This is where AI stops being something we look at and starts becoming something we live with. For John, a lot changes when AI leaves the screen and begins to operate in the physical world.

[00:05:04] John Robins: It means understanding that there are constraints like gravity and, as I mentioned, things like friction, distance, time and human unpredictability. It also means understanding affordances. Now I’m using design language here, but: what do objects allow, what do spaces constrain, and what do humans expect? The difference, as I explained earlier, is between AI that knows what a door is and physical AI that knows how to open the door, so it translates intent and text into action. It really is like making sense of the 3D world, perceiving it, reasoning about it and then taking autonomous action.

Sensors is a very broad term for multimodal perception. So it can be voice, it can be recognizing voice or gesture or biomarkers like temperature and all of those things. So it’s multimodal and, you know, computer vision camera-based systems as well, they also fall under the definition of sensors. It’s not just about seeing something, but deriving meaning from it.

Physical AI, of course, includes robotics. It’s one of the major applications of physical AI. But it’s not limited to hardware. There are many digital applications of physical AI as well. I think the key idea is that the substrate can be digital or virtual, but the logic has to be physical, as I mentioned, understanding the physics or the laws of the world.

I can give an example to explain a virtual or digital application. Nvidia came out with this system called Nvidia Earth-2, which is a digital twin of the planet for simulating climate. So it internalizes the physics of the atmosphere, pressure, heat transfer, fluid dynamics, and it can simulate a typhoon at like two-kilometer resolution in seconds. So there is no physical product here, no robot involved, but it’s pure physics-grounded intelligence in a digital world. I mentioned Fei-Fei Li. She’s one of the pioneers in the space. She says that LLMs understand the language of humans, physical AI must understand the language of nature. I think she puts it really well. So there are many other applications. You can think of, you know, game designers and filmmakers who can now conjure entire worlds, scenes and perspectives, which is less time-consuming and less expensive compared to the traditional production pipeline. Think about how every manufactured object or constructed space is usually designed in virtual 3D before its physical creation. With spatially intelligent models, we can now quickly visualize structures and walk through spaces and understand how we might want to live and work and gather. But again, the core is that physics-informed intelligence layer.

[00:07:38] Elizabeth Wood: What’s taking shape is an intelligence that goes past prediction and perception. It’s AI that understands how the world behaves and why humans move within it the way they do. Here John shares some of the potential use cases for physical AI.

[00:07:50] John Robins: I think it’s a combination of different factors. I think the immediate opportunity is definitely in robotics, in embedding intelligence into physical products that can do things that were not possible in the past. I think the hero moment is going to be when physical AI intersects with human needs. Think about an aging population, maybe a robotic dog that can alleviate loneliness. Or think about a drone system that can, you know, ship supplies to remote areas. Those are all applications where physical AI can make a huge difference. Is there going to be one main application? I don’t know the answer to that yet.

I think it’s about understanding context. Just to give an example, if I were to ask you, ‘what time is it?’ you might say 12:30, and that’s a correct answer, or you might say it’s time for lunch. So the nuance is understanding the context. You know, it’s not just a string of words. You’re able to translate my question into the intent that I have embedded in my question. Our sister company, Synapse, built a contextually aware robot many years ago, which kind of solidifies the point that I want to make. Imagine an Alexa system that we have used in the past. If you were to ask it, ‘turn that light off,’ Alexa doesn’t know which light to turn off. It just understands your words and your language, but doesn’t understand context. So Synapse built this contextually aware robot, Gerard. If you were to look in a particular direction or point in a particular direction and say ‘turn that light off,’ the robot senses and understands gaze and gesture, and can translate that into intent and take the right action of turning the right light off. So that’s what we mean by understanding human intent. It requires multimodal sensing of voice, gesture, gaze, movement and sometimes even emotional state. Imagine if a human stops suddenly. Is it because of fear, or did they drop something? The system has to be able to understand those kinds of nuances, and we’re not there yet, but there are a lot of advances happening in that space.
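One simple way to picture how ‘turn that light off’ could be resolved from a pointing gesture is to pick the light whose direction from the speaker is closest to the pointing direction. The Python sketch below does only that. It is not how Gerard works (the episode does not describe its implementation); the function names, the coordinate layout and the 20-degree tolerance are all assumptions for illustration.

```python
import math


def angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("vectors must be non-zero")
    # Clamp to guard against floating-point drift outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


def resolve_target(origin, direction, lights, max_angle_deg=20.0):
    """Return the name of the light best aligned with the pointing ray.

    origin:    (x, y, z) position of the person's hand or head (assumed known)
    direction: (x, y, z) pointing or gaze direction
    lights:    dict mapping light name -> (x, y, z) position
    Returns None when no light lies within max_angle_deg of the ray.
    """
    best_name, best_angle = None, math.radians(max_angle_deg)
    for name, position in lights.items():
        to_light = tuple(p - o for p, o in zip(position, origin))
        if not any(to_light):
            continue  # light coincides with the origin; direction undefined
        angle = angle_between(direction, to_light)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name


if __name__ == "__main__":
    lights = {"lamp": (2.0, 0.0, 1.0), "ceiling": (0.0, 0.0, 3.0)}
    print(resolve_target((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), lights))
```

Returning `None` when nothing falls inside the tolerance lets a caller ask a clarifying question instead of guessing, which fits the point that words alone do not carry the intent.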

[00:09:44] Elizabeth Wood: As autonomous systems leave controlled environments, they start to meet us where we are. And that changes the design challenge entirely. Here John shares why the human must always be at the center of physical AI innovation.

[00:09:57] John Robins: I think increasingly we will see autonomous systems sharing human spaces. And so we have to ask ourselves, you know, how do we want technology to show up in our space and share our space? What form it takes, what expression it takes, is really very nuanced. My colleague Inna Lobel can speak more about this, but imagine if you’re making a coffee-making machine, a robot that can just dispense coffee. Making coffee for us is a ritual. It’s more than just an activity. There’s emotion involved in it. There is a routine involved in it. Maybe you drink coffee with someone special in your life. So considering all those contextual factors of emotion and context and social cues is very important when thinking about the form and the expression of the system that you design.

Robotics itself is not new. It has been around for many decades, but there have been advances in technology, as well as shifts in how people think about technology, which are accelerating the advance of physical AI.

Thinking about it first from the perspective of AI, robotics in the past was mostly confined behind fences in constrained environments, and robots were scripted to do a specific task, like a pick-and-place robot, a repetitive task that it keeps doing again and again. But today, foundation models are allowing robots to learn from data, to learn from experience. So instead of being programmed to do something, almost like human beings, they’re learning from our experiences, which we show them through different kinds of training models, but also through their own experiences of doing something and then learning from it. And that means that today, robots are more adaptive. They’re not just doing one set task. They can adapt to new situations that they have not seen before. So that’s a big change when you think about it from the AI perspective.

The other angle is advances in hardware. Costs of hardware, especially sensors and compute, have dropped significantly. What that means is that physical AI is becoming more accessible. What cost, let’s say, $100,000 to build in the past may cost a fraction of that today. And so in addition to high-end industrial applications or automotive applications, physical AI will start entering service-centric contracts or consumer spaces as well.

So that’s the second piece. The third piece is socioeconomic factors. When we think about evolving consumer needs, this push for greater personalization and faster delivery is really what is accelerating physical AI in industrial settings. Meanwhile, aging populations are driving use cases in elder care, and structural labor gaps in retail are accelerating food automation robots as well.

So it’s this combination of technological advances in AI, drops in cost of hardware and socioeconomic factors that’s driving physical AI.

[00:12:50] Elizabeth Wood: We’re going to take a short break. When we return, John will discuss the human choices at the center of physical AI—what we hand over to machines, what we hold onto, and how those decisions shape the world we’re building together.

For more on how AI is bending the limits of our physical and digital worlds, download our latest frog report ‘Futurescape: Artificial Realities.’  

[00:13:32] Elizabeth Wood: Now back to John Robins, Head of Physical AI at frog.

[00:13:35] John Robins: So I think the technology is not in its R&D stage. It’s certainly in the early stages of deployment, maybe not fully ready to scale. Because of that, I think we will see the earliest adoption of physical AI in industrial settings. What I mean by that is that industrial settings are kind of semi-structured. The workflows are standardized. The space is more predictable and the stakes are low as well. If something fails, it is not catastrophic. Versus, let’s say, if you think about a home setting, it’s a little bit, or maybe not a little bit, it’s the most unstructured environment, an unpredictable space. You have wires hanging around, you have kids running around, and so building robotics for home spaces is going to be tricky. Or think about health care. I think that’s going to be one of the biggest spaces for adoption of physical AI, but it’s a highly regulated industry, and you can’t have a system that hallucinates or makes an error. And so, both from a technology standpoint and from a business standpoint, I think the earliest adoption is going to be in back-of-house automation. As I mentioned earlier, there are systemic structural gaps in the labor force, and so there’s a clear ROI in the industrial space. And as technology matures, and as people’s trust in autonomous systems increases, we will see adoption of physical AI in more consumer-facing spaces as well.

All these physical systems are also data collection engines. Whether it’s just a sensor in a piece of industrial equipment, or whether it’s a robot, it’s actually collecting data about real-world interaction with human beings while automating different activities and tasks. And so it is both an opportunity and a reason for caution. I think the opportunity is in integrating this new layer of data with existing digital data, from systems like an ERP or a WMS. If you think about integrating data coming from different streams, then you’re able to create new meaning, new categories of products and new revenue generation opportunities. And so I think data is going to be key, and data will be a differentiator and potentially a moat for the company as well, because this is your own proprietary data. It’s not just internet-scale data. This is specific to the enterprise or the application or the context in which you’re deploying the system. So data is going to be a key differentiator.

When I talk to clients, one of the things that I want to address is thinking about what kind of activity you want to automate. As this autonomous system starts sharing human spaces, what should remain deeply human, and what is it that you want to offload to a machine? And number two is also calculating or understanding the business value of automating a task. I was recently talking to a client. Their business goal was to double the output without increasing the number of employees. And, you know, to build a system like that, the upfront CapEx was really high, so it didn’t really make business sense for them just to create one robot for one retail space. So I think we had to ask ourselves, like, what activities are automatable, and is there business value? Mapping it against those two axes will give us the answer to what the things are that we really want to automate.

The tasks that are automatable are the ones that are usually repetitive and where the stakes are low, meaning that if something goes wrong, it’s not catastrophic. The upside of automating a task is high because you’re automating a repetitive task and taking that mundane task away from a human being, and if for some reason something fails, the downside is very low. So that, for me, is the criteria for deciding what to automate and what not to automate.
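The two axes described above, whether a task is automatable and whether automating it has business value, can be sketched as a simple screening function. The Python below is only an illustration: the `Task` fields, the example figures and the three-year payback threshold are assumptions, not numbers from the episode or from any client work.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    repetitive: bool
    failure_is_catastrophic: bool
    annual_value: float   # hypothetical estimated yearly benefit
    upfront_cost: float   # hypothetical CapEx to build and deploy


def is_automatable(task: Task) -> bool:
    """Axis 1: repetitive work where a failure is not catastrophic."""
    return task.repetitive and not task.failure_is_catastrophic


def has_business_value(task: Task, payback_years: float = 3.0) -> bool:
    """Axis 2: the benefit over the payback window covers the upfront cost."""
    return task.annual_value * payback_years >= task.upfront_cost


def should_automate(task: Task, payback_years: float = 3.0) -> bool:
    """A task qualifies only when it clears both axes."""
    return is_automatable(task) and has_business_value(task, payback_years)


if __name__ == "__main__":
    tasks = [
        Task("pallet sorting", True, False, 200_000, 400_000),
        Task("one robot for one store", True, False, 50_000, 500_000),
        Task("medication dosing", True, True, 300_000, 400_000),
    ]
    for task in tasks:
        print(task.name, should_automate(task))
```

The second example mirrors the retail case in spirit: the task is automatable, but the upfront cost is too high relative to the benefit of a single deployment, so it fails the business-value axis.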

[00:17:09] Elizabeth Wood: Sharing physical space with AI is only the beginning. The deeper shift comes when machines start to share our tasks and our time. During our conversation, John shared his perspective on turning from the functional aspects of technology to its real values.

[00:17:25] John Robins: I think the other important consideration would also be, what is it that humans want to hold, and what is it that they want to fold? What are the things that humans like to continue doing, and what are the things that they don’t want to do? Maybe it’s dangerous, maybe it’s dull and not as meaningful for them. So again, thinking about machines as autonomous systems that will share not just human spaces, but human tasks and decision making, we have to consider what will keep humans at the center of innovation and make life richer for them, rather than leaving them feeling like they’ve been replaced.

When you think about systems today, they have human-like characteristics, which means they have the intelligence to ingest information, reason about it and take action. They have that agency to decide for themselves without needing human intervention. The human is not programming it and telling it what to do. It’s learning from what humans have taught it, but then it can adapt itself to new situations. This is a big shift, because if you think about it, humans have always directed the work of other non-human entities, right? But now machines are becoming intelligent like humans. They can see, they can talk in natural language, they can perceive, break a problem into multiple steps and then take actions. And this is exciting, but it also demands real caution. In a digital system, you know, if a system hallucinates, we all have had experiences where you ask a question and it makes up something and makes it sound like it’s real. But you can use human judgment to understand that there is an error and you can correct it. But in the physical world, if the system hallucinates, so for example, if you ask it to pick a pipe and it pulls a live wire instead, it can be catastrophic. It can harm people, or it can cause damage to property. And so as systems become more and more autonomous, and machines are able to learn a range of new behaviors, it really becomes harder to predict and constrain their behavior. And so building systems with guardrails and operational limits, and having a human in the loop as much as needed, is, I believe, the right direction. So physical AI has to be predictable. It has to have legible behavior, where humans are able to understand what the system is doing, why it is doing that, and whether they can override it.
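To make the idea of guardrails, operational limits and a human in the loop a little more concrete, here is a minimal Python sketch of a gate that sits between a proposed action and its execution. It is an illustrative assumption throughout: the `ProposedAction` fields, the force limit and the confidence threshold are hypothetical values, not a description of any real safety system.

```python
from dataclasses import dataclass


@dataclass
class ProposedAction:
    description: str
    force_n: float      # force the system intends to apply, in newtons
    confidence: float   # system's confidence it identified the right target, 0..1


def gate(action: ProposedAction,
         max_force_n: float = 20.0,
         min_confidence: float = 0.9) -> tuple:
    """Decide whether to execute, defer to a human, or block an action.

    Returns (decision, reason) so the behavior stays legible to an operator.
    """
    if not 0.0 <= action.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if action.force_n > max_force_n:
        return "block", "force exceeds operational limit"
    if action.confidence < min_confidence:
        return "ask_human", "low confidence in target identification"
    return "execute", "within operational limits"


if __name__ == "__main__":
    print(gate(ProposedAction("pick up pipe", 12.0, 0.97)))
    print(gate(ProposedAction("pick up pipe (might be a wire)", 12.0, 0.55)))
    print(gate(ProposedAction("pull stuck pipe", 80.0, 0.99)))
```

Returning a reason alongside each decision is one small way to keep behavior legible: a person can see what the system intends, why it stopped, and where they can step in.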

[00:19:38] Elizabeth Wood: As the field matures, the question is no longer just can we build it. It’s how—and why—we choose to. Because, ultimately, these systems will shape the world we live in. When it comes to business strategy, John shared why certain principles must apply to designing, building and scaling physical AI.

[00:19:54] John Robins: I live in Silicon Valley, where the guiding principle is, you know, move fast and break things. I believe that when it comes to physical AI, we have to design with purpose and engineer responsibly. I believe that we are not just creating products. We are creating a new world, a new reality, in which we are sharing our lives, our spaces, our tasks, our decisions, with autonomous systems. And so we have to carefully think about the unintended consequences, and also, what kind of world do we want to create for ourselves and our future generations? The opportunity is real, but I don’t think that the answer is robots everywhere. You have to apply robotics and physical AI where they intersect with human needs. And as I mentioned earlier, that could mean a robotic dog that helps reduce loneliness for an aging parent. It could mean robot-assisted disaster evacuation, or, you know, drone delivery of critical supplies. Regardless of what that use case is, we have to identify where the pain points are and where it makes sense to have robotic systems take on some of these tasks, but the core principle is to design with purpose and engineer responsibly.

I would add that when you think about great physical experiences, I would say that technology should never be in our face. It should be invisible. It’s a system that fades into the background and quietly makes life easier for us. We see robots flipping, and sometimes you want to ask, is it cool or is it really creepy? We don’t know. So I believe that, you know, great physical experiences are the ones that make humans feel respected, empowered and augmented, and that take on the tasks and things that humans don’t want to do or should not do, instead of managing them or replacing them.

[00:21:31] Elizabeth Wood: Amidst all this complexity, John shares how something simple always comes back into view. Humans are social. We attach meaning. And any intelligence that shares our space will eventually share our emotional landscape too.

[00:21:44] John Robins: I think, you know, human beings are social by nature. We like to bond. We like to connect. Even when you think of, you know, something like ChatGPT or Claude, any LLM based system, humans are forming a kind of relationship with it, a parasocial relationship, so to speak. So when you think about physical AI robotic systems, these are systems that will occupy our space, so we will, over time, maybe form some kind of connection with these systems. So having systems that share our space and take on our tasks while being relatable, and systems that you feel comfortable around and systems that create delight for you, is where the magic lies.

I believe that physical AI needs both designers and engineers. Designers imagine what could be without any constraints: new futures, experiences, systems that could advance human life. I believe engineers bring that to life, bring that to reality, working with the constraints of what technology can do. I often like to think of them as, you know, the left and right sides of our brain: the analytical and the experiential sides of humanity that, when they come together, become one cohesive unit. I do think that there’s a lot that we can learn from each other. I believe that engineers have to understand and develop fluency in what I would call soft metrics: understanding delight, trust, meaningful experiences, memorable experiences. And I feel that designers have to respect hard metrics: latency, reliability, safety margins, mean time between failures. So I think we need a shared language between designers and engineers, because design decisions become system behavior, and system behavior drives human experiences. And so, you know, when we bring those two sides together, imagining what could be and bringing it to reality, that’s when magic happens.

[00:23:35] Elizabeth Wood: That’s our show. The Design Mind frogcast was brought to you by frog, a leading global creative consultancy that is part of Capgemini Invent.

Check today’s show notes for transcripts and a link to download frog’s Futurescape report.

We really want to thank our guest, John Robins, Head of Physical AI at frog, for sharing his insights.

We also want to thank you, dear listener. If you like what you heard, tell your friends. Rate and review to help others find us on Apple Podcasts and Spotify. And be sure to follow us wherever you listen to podcasts. Find lots more to think about from our global frog team at frog.co/designmind. That’s frog.co. Follow frog on X at @frogdesign and @frog_design on Instagram. And if you have any thoughts about the show, we’d love to hear from you. Reach out at frog.co/contact. Thanks for listening. Now go make your mark.

Authors
John Robins
Head of Physical AI

John focuses on building connected ecosystems where intelligent systems and humans work seamlessly together. With 18+ years in product strategy and innovation, he partners with innovative companies and senior leaders to shape new futures using AI, robotics, computer vision, smart sensing, and geospatial technologies.

He previously led product management at a major technology company and ran an industrial IoT startup. John holds an MBA from De Montfort University, is pursuing an MS in Computer Science (AI) at Georgia Tech and frequently speaks on Physical AI. He is also a Gartner Product Management Ambassador.

Outside of work, John loves to explore new countries, hit the padel court, and catch live theatre.

Elizabeth Wood
Host, Design Mind frogcast & Editorial Director, frog Global Marketing

Elizabeth tells design stories for frog. She first joined the New York studio in 2011, working on multidisciplinary teams to design award-winning products and services. Today, Elizabeth works out of the London studio on the global frog marketing team, leading editorial content.

She has written and edited hundreds of articles about design and technology, and has given talks on the role of content in a weird, digital world. Her work has been published in The Content Strategist, UNDO-Ordinary magazine and the book Alone Together: Tales of Sisterhood and Solitude in Latin America (Bogotá International Press).

Previously, Elizabeth was Communications Manager for UN OCHA’s Centre for Humanitarian Data in The Hague. She is a graduate of the Master’s Programme for Creative Writing at Birkbeck College, University of London.

Audio Production by Steven Strange