Most insurance teams are chasing AI. The question is whether they’re chasing the right outcomes.
On March 17, InsurTech NY hosted a webinar on underwriting judgment in AI-enabled workflows, bringing together operators who’ve actually implemented these systems at scale. The panel included Mladen Subasic, Chief Product Officer at BoundAI and OIP Insurtech; Ted from Bridge Specialty, leading digital distribution; and Gabe from Triricura, a company in the medical professional liability space. David moderated.
The conversation wasn’t about what AI could theoretically do. It was about what breaks in production, where implementations get rolled back, and what actually separates efficiency from better outcomes.
If you missed it, here’s what we covered.

The Triage Problem – Beyond Data Extraction
The webinar opened with a poll: Where is your organization on its AI journey?
The results split evenly. About half were in the exploratory phase, while the other half were piloting and experimenting. Only a small portion was running multiple live projects simultaneously.
That tells you most organizations are still figuring out what AI can actually do for them and where it fits.
David kicked things off by asking Mladen about the gap between data extraction and triage. A lot of carriers have invested heavily in extraction, pulling data out of submissions, loss runs, and documents. But extraction alone doesn’t change outcomes.
Mladen was direct: if you stop at extraction, you’re missing the point.
“We are stopping at operational savings and missing out,” he said. “We are working on expense ratios, but we’re not pushing that data forward to the underwriters who are market facing and not allowing them to efficiently underwrite to capture more quality risk.”
Once you’ve captured the data, the real value comes from decision support: triage, risk scoring, portfolio management, and risk insights. That’s where underwriters get time back. That’s where you start affecting quote-to-bind ratios, not just processing speed.
Ted pushed back on the assumption that faster automatically means better.
“If improved triage just helps submissions move through a system more quickly, then all we’ve really done is compress the timeline,” he said. “Same risks get declined. Same ones get quoted. The underlying economics really don’t change all that much.”
The question isn’t whether things are moving faster. It’s whether underwriters and brokers are spending more time on the right risks. If triage can more accurately identify which submissions are viable earlier in the process, that’s when you see real change. Brokers spend less time chasing marginal risks. Underwriters go deeper on fewer, better-aligned opportunities. Quote rates, conversion rates, and binding outcomes improve.
Speed is a byproduct of a smarter system.
What Clean Submissions Actually Look Like
So what does a clean submission look like when you’re underwriting complex risks?
Gabe’s world is medical professional liability: skilled nursing facilities, high claims severity, regulatory exposure. These aren’t risks you automate end-to-end.
His funnel is steep. Triricura rejects about 50% of submissions off the bat because they don’t fit underwriting appetite or pricing alignment. Of what makes it through, the quote-to-bind ratio is around 20%.
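Those funnel figures are worth doing the arithmetic on. A minimal sketch, using the percentages Gabe cited and assuming every submission that clears triage gets quoted (an assumption the recap doesn't state):

```python
# Illustrative funnel math from the figures cited in the webinar.
# Assumes every triage-passing submission is quoted, which is a
# simplification for the sake of the arithmetic.
submissions = 1000
triage_pass_rate = 0.50   # ~50% rejected off the bat
quote_to_bind = 0.20      # ~20% of quoted risks bind

quoted = submissions * triage_pass_rate   # 500
bound = quoted * quote_to_bind            # 100
overall_bind_rate = bound / submissions   # 0.10

print(f"quoted={quoted:.0f}, bound={bound:.0f}, "
      f"overall bind rate={overall_bind_rate:.0%}")
```

The takeaway: only about one in ten incoming submissions binds, which is why shifting underwriter attention earlier in the funnel matters more than processing every submission faster.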
“Sifting through all of that, it’s super helpful to have technology as part of that process,” Gabe said.
But technology doesn’t make the final call. It focuses attention.
Triricura uses AI for triage and risk scoring, routing submissions to the right underwriters, setting priority, and identifying which accounts need deeper analysis. The goal isn’t to replace underwriting judgment. It’s to make sure human time is spent where it’s needed most.
“We leverage technology to help facilitate a triage and risk scoring process that drives efficiency in our process and focuses the allocation of resources to the highest and best use,” Gabe explained.
The system flags things like technical pricing deltas, account complexity, and underwriting fit. But the ultimate decision (selection, structuring, pricing) still sits with the underwriter.
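As a rough illustration of that division of labor, here is a minimal triage-routing sketch. The field names, weights, and thresholds are invented for illustration; this is not Triricura's actual model. The key design point is the return type: the function outputs a queue, never a quote/decline decision.

```python
from dataclasses import dataclass

# Hypothetical submission record; fields mirror the signals mentioned
# above (pricing delta, complexity, appetite fit). All values invented.
@dataclass
class Submission:
    pricing_delta: float   # gap between technical and market price (fraction)
    complexity: float      # 0 (simple) .. 1 (complex)
    appetite_fit: float    # 0 (out of appetite) .. 1 (core appetite)

def triage(sub: Submission) -> str:
    """Route a submission to a queue; the underwriter makes the call."""
    if sub.appetite_fit < 0.3:
        return "decline-review"   # likely out of appetite; quick human check
    # Illustrative weighted score rewarding fit, penalizing pricing gap
    # and complexity.
    score = (0.5 * sub.appetite_fit
             - 0.3 * abs(sub.pricing_delta)
             - 0.2 * sub.complexity)
    if score > 0.25:
        return "fast-track"       # strong fit, priced near technical
    return "deep-review"          # needs senior underwriter attention

print(triage(Submission(pricing_delta=0.05, complexity=0.2, appetite_fit=0.9)))
```

Every path ends at a human: even "decline-review" routes to a person rather than auto-declining, which is what keeps this decision support rather than automation.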
That’s the difference between automation and decision support. Automation tries to eliminate judgment. Decision support tries to protect it.
Loss Runs – The Persistent Challenge
An audience question shifted the conversation to loss runs, one of the industry’s longest-standing data problems.
Loss runs are critical for experience rating, but they’re a nightmare to work with. Hundreds of thousands of carrier templates, inconsistent formats, and data that needs to be structured, normalized, and mapped to the right level of risk.
Mladen acknowledged the technical challenge but said the tools are finally there.
“Today, with the newer approaches to capturing that information, it becomes much easier,” he explained. “We’ve accomplished a lot with a combination of technologies that work on loss runs. It’s robust in the back end, but there’s a lot we can do today to capture that information much faster and with much more robust data sets.”
Once you capture and structure loss run data, it feeds into experience rating, underwriting algorithms, and pricing models. But it depends on how the rating and underwriting processes are set up downstream.
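The normalization step is where template variation bites. A minimal sketch of the idea, mapping each carrier's column headers onto one canonical schema (the header aliases here are invented examples; real loss runs span vastly more templates and fields):

```python
# Map carrier-specific loss-run column headers onto a canonical schema.
# Alias sets are illustrative, not an actual carrier template library.
CANONICAL = {
    "claim_number": {"claim #", "claim no", "clm_num"},
    "loss_date":    {"date of loss", "dol", "loss dt"},
    "paid":         {"paid loss", "total paid", "pd amt"},
    "reserved":     {"outstanding", "o/s reserve", "case reserve"},
}

def normalize_row(raw: dict) -> dict:
    """Return a row keyed by canonical field names."""
    out = {}
    for field, aliases in CANONICAL.items():
        for key, value in raw.items():
            if key.strip().lower() in aliases or key.strip().lower() == field:
                out[field] = value
    return out

row = normalize_row({"Claim #": "A-100", "Date of Loss": "2023-04-01",
                     "Total Paid": 12500, "O/S Reserve": 4000})
print(row)
```

A static alias table like this is exactly what broke down at wholesale scale; the newer approaches Mladen describes use models to do this mapping rather than hand-maintained lookup tables, but the target (one canonical, structured record per claim) is the same.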
Ted agreed, but added historical context.
“If you look back from InsurTech 1.0 (talking 2015, 2016) to today, it does feel like the technology is where we need it to be to start tackling things like loss runs,” he said.
The challenge wasn’t just extraction. It was doing it at scale across hundreds of capacity providers. Old-school OCR tools worked fine with a single carrier, but they showed their flaws when dealing with the volume and variation in wholesale markets.
“Loss runs is one of those areas where you’ve got to be precise. There’s no way around it,” Ted said. “It does feel like we’re nearing the point where we can actually begin to ingest loss run data a lot more efficiently.”
The technology exists. The question is implementation.
Common Implementation Mistakes
David asked Mladen where implementations go wrong: specifically, where carriers make a major update and then roll it back.
Mladen’s answer: black-box AI.
“There is a lot of attention to be paid to the ways that the underwriting process, marketing strategies, guides, portfolio locations are managed today,” he said. “There’s a lot of technology debt, there’s a lot of process depth, and then there’s this promise of these newer AI technologies which are great on a shallow use case.”
The problem is when AI tries to reinvent the wheel. It looks good in a demo. It works for 30 days. Then it gets rolled back because it doesn’t account for how underwriting teams actually select risks, how broker teams trade, or the nuances already embedded in the company’s expertise.
“Triage is a sophisticated dance of the way that the company sees a good risk, the way that an underwriter wants to see the insight on that specific class,” Mladen explained. “That black-box approach works well for up to 30 days. After that, it has to be rolled back.”
The second mistake: only capturing data on bound risks.
When you ignore declined submissions, you lose critical information that would improve triage models, risk scoring, and appetite alignment down the road. Getting a sharp focus on every risk that hits the underwriting desk (not just the ones that bind) pays dividends later.
“What we are trying to do is be a good tool and a partner for the underwriting teams,” Mladen said.
That’s the principle. AI should work with underwriters, not try to replace them.
Where Underwriting Logic Belongs
An audience question pushed the conversation deeper: Are we just using AI for extraction to execute deterministic rules, or are we actually leveraging AI for qualitative underwriting decisions?
The example given: Can AI tell you there’s no pattern of repeated foodborne illness claims, or that premises liability claims aren’t recurring? Can it make judgment calls on quality, not just check boxes?
Gabe’s answer was clear: Yes, it’s possible. But that doesn’t mean you should let AI make the final call.
“We don’t use generative AI in the pricing inference of our underwriting process,” Gabe said. “The models that we utilize are gated, self-audited, and have strict guidelines on what constitutes intended behavior.”
Triricura uses AI to surface relevant information, aggregate context, and flag qualitative patterns. But the ultimate decision still sits with the human underwriter. The technology provides helpful context.
“We have a heavy amount of human input and transparency in our process to synergize human intuition and decisioning on how selection, structuring, and pricing are established,” Gabe explained.
That’s the line. AI can assess qualitative factors. But in complex risks, it should inform decisions, not make them.
Ted added another layer: explainability.
As a wholesaler, Bridge Specialty sits between retail agents and carriers. They see the variation in what retail wants to bind and what carriers are willing to accept. AI can bring clarity to fragmented appetite signals, turning inconsistent inputs into structured, comparable data.
But there’s a catch.
“If AI is introducing appetite recommendations or decisions that feel more like a black box, you create a new kind of ambiguity,” Ted said. “The answer might be faster, but it’s harder to trust. It’s harder to explain.”
Wholesalers build long-term relationships with retail partners on trust and clarity. If they can’t explain why a risk was declined or quoted on certain terms, the relationship breaks down.
“AI’s got to go beyond accuracy. It has to be explainable. It has to be actionable,” Ted emphasized. “It has to make appetite legible. It has to make brokers and retailers more effective communicators.”
That’s why explainable AI (XAI) matters. In a relationship-driven business, you can’t just give fast answers. You have to give credible answers that people can understand and stand behind.
Build vs Buy
Another audience question: What’s the panel’s perspective on build versus buy, given the advancement of AI coding tools?
Gabe’s framework was simple: universal problems deserve bought solutions. Niche competitive advantages deserve custom builds.
“There are certain problems that we experience, but many other MGAs experience similarly, in which case there’s probably somebody out there that’s built a really high-performance solution,” Gabe said. “Generally, we’re going to make a decision to buy that technology.”
Loss run extraction is a good example. It’s a universal problem. Someone else has already solved it at scale. Buying makes sense.
But when it comes to hyper-specialized areas of expertise, things that differentiate Triricura in the market, that’s where they build.
“Where we see an opportunity to build is where there’s kind of super niche or hyper-specialized areas of expertise that are not necessarily universally applicable to other MGAs, but are super important for us,” Gabe explained.
The decision comes down to two questions: Is this problem universal or specific to our business? And is the value of solving it commercially viable to others, or is it purely a competitive advantage for us?
If it’s universal and someone else has solved it well, buy it. If it’s niche and core to your differentiation, build it.
The Future of Wholesalers and Retail Agents
An audience question asked: If retail agencies adopt AI systems that can consistently assess and pre-qualify risks at a high degree of confidence, will wholesalers just trust what they get and do minimal re-underwriting?
Ted’s answer: Directionally, yes. But it’s not that simple.
“If retail agents are using approved or agreed-upon AI systems to pre-qualify risks in a consistent and credible way, it should reduce a lot of the noise that flows into the channel today,” Ted said. “Submissions will be better formed. There’ll be clear alignment on appetite, fewer dead ends. I think that’s ultimately good for everyone.”
But that doesn’t eliminate the wholesaler’s role.
“The reason wholesalers exist isn’t just to process submissions. It’s to interpret markets, structure deals, navigate complexity across carriers. That layer doesn’t go away just because the front end gets smarter.”
What changes is where the work happens and what kind of work gets done.
If retail does a better job of pre-qualification, wholesalers can spend less time re-checking basics and more time on higher-value activities, shaping the risk, positioning it to the right markets, negotiating competitive terms, solving edge cases.
But there’s a trust dynamic.
“For wholesalers to truly accept retail-generated output, there has to be confidence not just in the technology but in how it’s being used,” Ted explained. “Is it being used consistently? Is it explainable? Is it aligned with how carriers actually think about the risk?”
Until that’s proven, which will take time, there will still be validation in the middle.
“I’d frame it less as elimination of underwriting at the wholesale level and more as a compression of redundant evaluations,” Ted said. “The work doesn’t disappear. It just shifts. And in a business like ours where relationships and accountability really matter, someone’s always going to own the final decision and stand behind it. AI can inform that, but I don’t think it replaces it.”
What’s Actually Working in Production
The final audience question: What are some use cases in production based on generative or agentic AI, and what were the challenges moving from proof of concept to production?
Mladen’s answer came back to the bow tie model.
The most mature use cases are on the left side of the bow tie, collapsing the cost of collecting and ingesting complex data. Statement of values, bordereaux reports, loss runs, anything that’s difficult to collect but provides dividends downstream.
“Complex data collection is now getting the most word of mouth and it stays,” Mladen said. “That’s the technology that actually has a high retention ratio. It has an ROI of six months. It’s very fast and produces a lot of value.”
Once you collapse that cost, you unlock the right side of the bow tie, market-facing use cases that improve quote-to-bind ratios and operational efficiency.
“We start opening up for more use cases which are working on quote-to-bind ratios or working on other market-facing metrics,” Mladen explained. “Quote compares, policy checking, bind and file audits, market outreach technologies, risk exchanges, even portfolio underwriting.”
These are the second-wave use cases. They depend on having clean, structured data feeding into them. But when they work, they create competitive advantages.
The key is being mindful about what problem you’re solving and applying the right tool to it.
“Most of it can be solved,” Mladen said. “It’s just a matter of choosing the right partner and the right approach and being mindful about which goals we are tackling.”
Conclusion
The webinar made one thing clear: AI isn’t useful because it’s fast. It’s useful when it gives underwriters, brokers, and wholesalers their time back and directs that time toward decisions that actually matter.
The organizations that will win aren’t the ones chasing the most advanced demos. They’re the ones that understand their workflows, choose tools that work with their teams instead of against them, and focus on outcomes.
Speed is a byproduct of a smarter system. Trust is built on explainability. And judgment, human judgment, still sits at the center of underwriting.
If you missed the live webinar, the recording is available. And if you’re ready to explore how BoundAI can help modernize your underwriting operations, reach out to us.