We Spent a Decade Teaching People to Type. They Were Always Meant to Talk.

Humans have spoken for hundreds of thousands of years and written for only a few thousand. The friction was never the medium; it was the mismatch.

Sahitya Sridhar, COO
Jul 5, 2026

Think about what we asked people to do over the last ten years.

We handed them apps and told them to navigate menus. We gave them search bars and told them to craft queries. We built portals, ticketing systems, chat windows, and knowledge bases, and then we were surprised when engagement was low and satisfaction was mediocre. We optimized the text interface obsessively and called it digital transformation.

We were solving the wrong problem.

The friction was never in the medium. It was in the mismatch. Human beings have communicated through speech for hundreds of thousands of years. We've used text for a few thousand. And yet we spent the last decade assuming that the right interface for everything — getting medical information, talking to a brand, requesting support — was a keyboard and a screen.

That assumption is unraveling fast.

The shift isn't coming. It's already here.

A few numbers worth sitting with.

By 2025, more than 60% of healthcare professionals (HCPs) said they preferred digital engagement over face-to-face interaction with pharma. That gets reported as a win for digital, but read it more carefully. What HCPs actually rejected was the episodic, scheduled, rep-driven model. They didn't want to carve out time. They wanted information on demand, in the moment it was relevant, without coordinating calendars.

That's not a preference for text. That's a preference for voice - they just didn't have a good voice option until recently.

Leading institutions like Memorial Sloan Kettering have now fully restricted sales rep access, leaving the vast majority of their oncologists unreachable through traditional field channels. Nearly 70% of the HCP universe sits in what commercial teams call "white space" - below the threshold where reps are deployed at all. These aren't unimportant physicians. They're just unreachable at human scale.

Voice changes that equation completely.

What text can't do

Text-based engagement has a ceiling, and we've been bumping against it for years.

The problem isn't that people don't read. It's that reading requires a different kind of attention than speaking. When you want to know whether a dosing regimen changes in renally impaired patients, you don't want to open a portal, log in, navigate to a search field, type a query, parse a PDF, and scroll to page 12. You want to ask the question and get the answer. Verbally. In thirty seconds.

Text also strips context. Tone, urgency, and uncertainty, the signals that tell you whether someone actually understood or is just moving on, all disappear in a chat transcript. A voice interaction captures all of it. You can hear when someone is hesitant. You can hear when they're satisfied. You can hear when they have a follow-up question they haven't asked yet.

And text doesn't scale the way people think it does. More messages sent does not mean more engagement generated. Inbox fatigue is real. Open rates for pharma email hover in the low-teen percentages, with engagement declining even as send volumes increase. The volume keeps going up; the signal keeps going down.

Why voice AI is different from every voice attempt before it

We've had voice technology in pharma for twenty years: interactive voice response (IVR) systems, automated refill reminders, appointment confirmations. None of it moved the needle because none of it was actually conversational. It was text with an audio wrapper: scripted, rigid, and incapable of handling anything that deviated from the expected path.

What changed is the underlying model. Modern voice AI doesn't navigate decision trees. It understands intent. It handles ambiguity. It can discuss a clinical question, catch a follow-up, escalate when something requires a human, and log everything — in a single fluid exchange that feels nothing like the IVR hell everyone remembers.

A friend who's been investing in health infrastructure for the last decade described it to me this way: "The old voice systems were phones pretending to be software. The new ones are intelligence with a voice - entirely different category."

That's the right framing. This isn't voice as a delivery mechanism. It's voice as a reasoning interface.

The behavioral unlock

Here's what actually happens when you remove the friction of text.

Engagement rates go up — not incrementally, but structurally. When an HCP can call back at 9pm after a long clinic day and get a real answer to a real clinical question without navigating a portal, they do it. When a patient can ask about their treatment in plain language without translating their question into search syntax, they ask more.

The nature of the interaction changes too. Voice conversations are longer, richer, and more specific than text exchanges. People ask follow-up questions they wouldn't type. They share context they wouldn't enter into a form field. You learn things in a voice interaction that a text channel will never surface.

And perhaps most importantly: the long tail becomes reachable. The community oncologist in rural Ohio who has never spoken to a rep and never will — voice gets there. The patient who is not digitally fluent enough to navigate a health portal but can absolutely hold a conversation — voice gets there. The coverage model stops being defined by human bandwidth and starts being defined by who actually needs to be reached.

What this means for how we build

The organizations that are thinking clearly about this aren't asking "how do we add voice to our existing channels?" They're asking something harder: if voice is the primary interface, what does everything else exist to do?

That's a different design question. It means the intelligence layer has to be built around spoken interaction, not retrofitted from text. It means compliance frameworks need to be designed for real-time conversation, not asynchronous content review. It means the metrics change — not open rates and click-throughs, but depth of engagement, resolution rate, time-to-answer, and breadth of reach.
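To make that metric shift concrete, here is a minimal sketch of how the four measures might be computed from a log of voice conversations. Everything in it is an assumption for illustration: the Conversation record, its field names, and the definitions chosen (average turns for depth, share of resolved conversations, median seconds to a first useful answer, and unique contacts reached as a share of the target population). A real program would define these to suit its own data and compliance rules.

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class Conversation:
    """One logged voice conversation. All field names are hypothetical."""
    contact_id: str           # who was reached (HCP or patient)
    turns: int                # number of back-and-forth exchanges
    resolved: bool            # did the caller get an answer?
    seconds_to_answer: float  # time from question to first useful answer


def engagement_metrics(conversations, target_population):
    """Summarize a batch of conversations with the four metrics named above."""
    if not conversations or target_population <= 0:
        raise ValueError("need at least one conversation and a positive population")
    unique_contacts = {c.contact_id for c in conversations}
    return {
        # Depth: average number of exchanges per conversation.
        "depth_of_engagement": mean(c.turns for c in conversations),
        # Resolution: share of conversations that ended with an answer.
        "resolution_rate": sum(c.resolved for c in conversations) / len(conversations),
        # Time-to-answer: the median is less sensitive to a few slow calls.
        "median_time_to_answer_s": median(c.seconds_to_answer for c in conversations),
        # Breadth: distinct people reached, as a share of everyone targeted.
        "breadth_of_reach": len(unique_contacts) / target_population,
    }


# Made-up sample data, purely to show the shape of the calculation.
sample = [
    Conversation("hcp-001", turns=6, resolved=True, seconds_to_answer=20.0),
    Conversation("hcp-002", turns=3, resolved=False, seconds_to_answer=45.0),
    Conversation("hcp-001", turns=9, resolved=True, seconds_to_answer=30.0),
]
metrics = engagement_metrics(sample, target_population=10)
print(metrics)
```

Note that none of these numbers depends on how many messages were sent. Breadth counts people rather than calls, so a repeat caller deepens engagement without inflating reach.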

The companies that get there first won't just run more efficient field operations. They'll have built something their competitors can't replicate quickly: a conversational layer that compounds with every interaction, learns every territory, and gets better the more it's used.

The behavior shift already happened. People want to talk. We're just now building systems good enough to meet them there.
