Ryan Tang
BenchSci · Lead Product Designer · 2025

Agentic tables and chat differentiation for scientists' workflows

Led a prototype exploring how AI could reshape how scientists interact with research data. Discovered an unexpected usage pattern during testing, validated the concept with scientists and commercial partners, and shaped it into a meaningful direction for the platform.

Prototype · Agentic orchestration · Cursor · Claude Code · Innovation
Roadmapped: Differentiated AI experience
0+ Research sessions
0% Unprompted adoption
Differentiation: New AI interaction model

Establishing the interface model in the Methods hybrid chat-and-table MVP for BenchSci's Experiment Validation value stream gave us clarity and resolved our most urgent needs.

However, open questions remained about how our experience could differentiate from the many other chat products. That led me to take my observations of the MVP so far and ask:

What would it mean if the table itself were agentic?

Agentic table created by orchestrating user context. Read on to find out what makes this table agentic.

The observation

This question emerged while designing the core hybrid interface. During user testing, I kept noticing scientists asking questions about the table.

My hypothesis had been that chat and table would split cleanly along diverge/converge lines: chat for broadening, table for deciding. But that's not what happened.

Scientists' follow-up questions were almost always table-specific: filtering rows, adding columns, making comparisons, or asking questions triggered by what they were seeing in the results. Often it wasn't a new overarching question, but one rooted in the evidence in front of them.

Scientists were diverging from within the evidence.

MENTAL MODEL

Where scientists actually diverge

The assumption: scientists diverge in chat. Follow-up questions happen after reaching a decision.

Query (e.g. TLR4 IHC protocols) → Response (AI directs) → Tool (evidence table) → Decision (e.g. protocol conditions) → Follow-up: a new question in chat, e.g. "What primary antibodies would work best?"

The observation: scientists diverge at the table. Follow-up questions happen before deciding, driven by what they see in the evidence.

Query (e.g. TLR4 IHC protocols) → Response (AI directs) → Tool (primary antibodies performance; diverge here) → Follow-up: a new question at the table, e.g. "What secondary antibodies would work best?" → Decision (primary antibody + conditions)

This behaviour revealed a mismatch between the mental model I had designed for and the one scientists brought as they engaged. It was something neither our initial hypothesis nor our chat supported.

I discovered that for scientists, the process we had assumed was 'evaluation' was actually 'exploration'. Scientists don't stop asking questions when they reach the evidence. Instead, because of their depth of knowledge, the evidence tended to be where the questions most impactful to their work actually start.

It also presented a unique differentiation opportunity that would further take advantage of our data moat.

The opportunity

Forcing scientists to context-switch to chat every time they wanted to ask a question about the table imposed a cognitive cost and created an expectation gap, because the chat didn't work that way. The chat could, at best, run a new query that might return a new table, leaving the challenge of re-anchoring solely on the scientist.

Scientists scanning a table are holding a lot in working memory: which results they've already assessed, which criteria matter most for this particular decision, which rows look promising and why. Shifting the visual anchor (specific data on that table) would make that scientifically rigorous evaluation much more challenging.

The question became: what would it look like if the table itself was agentic — if scientists could extend, filter, and interrogate their evidence without ever leaving it?

OPPORTUNITY

Differentiation opportunity → Agentic Tool

A 2×2 map of the landscape, plotting data (dynamic vs. static) against user agency (low vs. high):

- Dynamic data, low user agency: Agentic Chat, Chatbots
- Static data, low user agency: Legacy Platform, Databases / Ontologies, Search
- Static data, high user agency: Papers / Docs, Manual Documentation
- Dynamic data, high user agency: Agentic Tool. HMW allow users to have high agency and explore dynamic data?

Approach: Build to think

I reasoned that this was ultimately a question about the interaction model and agentic orchestration, neither of which would be well demonstrated or tested in a static or even clickable prototype.

A Figma prototype of an agentic table would have simulated what the interaction looked like. A coded prototype had to actually behave, which meant I'd be testing real LLM response parsing, latency, response variability, and real confusion points rather than an idealized happy path.

Built with v0 (initial UI and scaffolding), Claude (functionality support), Gemini (API for speed and cost), and Cursor (code editing and iteration).

This interaction model depended on the quality of the feedback loop between user action and system response. That difference is what ultimately led me to build a functional prototype with v0, Claude, and Cursor rather than Figma.

Step 1: Orchestration: Context Bus

Building with AI taught me to plan dataflow, structure, and scalability upfront, and to clearly define what the prototype needed to prove versus what could stay rough.

For this concept, I was testing interaction depth, not visual fidelity, so I deliberately avoided over-aligning to the BenchSci DS to prevent context bloat. I mapped out the user flow, dataflow, and agentic orchestration early, connecting conceptually to our Search APIs or a bespoke MCP, but using synthetic data to keep complexity manageable.

My original scope was just divergence from a table, but I quickly hit a core challenge: how does an agent maintain context of both the table and the user's original intent? This pushed me to anchor the entire orchestration around the scientist's origin query, surfaced from the start of their session.

Since LLMs are probabilistic, reasoning chains drift. A context bus solved this by preserving the original query as persistent context that all downstream agents could reference.
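A minimal sketch of this write-once idea, assuming a simple key-value bus (names like `ContextBus` and the key list are illustrative, not the production system):

```typescript
// A minimal context bus: shared memory that pins the scientist's
// original query so every downstream agent step can reference it.
type BusKey = "query" | "intent" | "tableData" | "summary";

class ContextBus {
  private store = new Map<BusKey, unknown>();

  // The origin query is written once and never overwritten, so later
  // steps in the reasoning chain cannot drift away from the original intent.
  write(key: BusKey, value: unknown): void {
    if (key === "query" && this.store.has("query")) return;
    this.store.set(key, value);
  }

  read<T>(key: BusKey): T | undefined {
    return this.store.get(key) as T | undefined;
  }
}

const bus = new ContextBus();
bus.write("query", "TLR4 IHC protocols");
bus.write("intent", "compare antibody performance");
// A later agent attempts to replace the query mid-chain; the bus refuses.
bus.write("query", "unrelated pivot");
console.log(bus.read<string>("query")); // the origin query survives
```

The write-once rule on `query` is the whole trick: drift is prevented structurally rather than by prompting.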

From there, "adding a column" became the interaction to test: it keeps the table as a visual anchor while enabling meaningful exploratory divergence.

Chat and raw data views were considered but ruled out for adding complexity without serving the core hypothesis.

Agentic table architecture

Mini data pipeline, with the original intent preserved at every step.

Pipeline steps: 01 User Input → 02 Intent → 03 Primary Driver → 04 Supporting → 05 Data → 06 Ranking

Memory (context bus) keys: query, intent, primaryDriver, columns, rows, tableData, summary

Outputs (produced in parallel at the same step): Table Output + Summary; Add Column + Templates (column and template actions)

processQuery(query): the scientist's natural-language question enters and is pinned to the bus for every downstream step. Reads: nothing. Writes: query.

From a UX perspective, the context bus is what makes the table feel aware of what the scientist is doing. Staying oriented to their original intent means less re-anchoring and freer exploration.

The tradeoff is that persistent context can be limiting for scientists who genuinely want to pivot mid-session. Ideally, an agent would recognize that shift automatically, but in this prototype, a new search handles it, which is already a familiar mental model.
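The six-step pipeline can be sketched as a chain of steps that each read from and write to one shared memory object. This is a simplified illustration: the step bodies here are stubs standing in for the actual LLM calls, and all names beyond those in the diagram are assumptions.

```typescript
// Each pipeline step transforms shared memory; the origin query enters
// once and is carried through every step unchanged.
type Memory = {
  query?: string;
  intent?: string;
  primaryDriver?: string;
  columns?: string[];
  rows?: string[];
  summary?: string;
};

type Step = (mem: Memory) => Memory;

// Stubs for the LLM-backed steps in the real pipeline.
const processQuery: Step = (mem) => mem; // query already pinned on input
const inferIntent: Step = (mem) => ({ ...mem, intent: `intent(${mem.query})` });
const pickDriver: Step = (mem) => ({ ...mem, primaryDriver: "performance" });
const buildColumns: Step = (mem) => ({ ...mem, columns: ["antibody", mem.primaryDriver!] });
const fetchRows: Step = (mem) => ({ ...mem, rows: ["row-1", "row-2"] });
const rankAndSummarize: Step = (mem) => ({ ...mem, summary: `ranked for: ${mem.query}` });

function runPipeline(query: string): Memory {
  const steps = [processQuery, inferIntent, pickDriver, buildColumns, fetchRows, rankAndSummarize];
  const init: Memory = { query };
  return steps.reduce((mem, step) => step(mem), init);
}

const out = runPipeline("TLR4 IHC protocols");
// out.summary remains anchored to the origin query
```

Because every step receives the full memory, even the final ranking step can still see the verbatim origin query, which is what keeps late-stage outputs from drifting.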

Step 2: Functional Prototype

Aside from keeping the visuals deliberately rough, I also decided to under-tune the prompts to allow greater flexibility of input as I socialized this interaction model.

This meant that non-scientific product, commercial, and engineering partners could input a query from a field they're interested in, and experience the agentic table with relevant information.

The initial query would set up the context bus and create the initial table. The visible pipeline gave users clear progress and combatted the perception of latency.

Adding columns would generate the information based on the row reference and the context bus. This updates the table with the column and in parallel updates the summary.

Viewing a cell generates an insight that is specific to that table and query context. This enables a more personalized experience.

First iteration with scientist query, viewing table, adding column, and then viewing an insight

Initial user feedback

I tested this with 10 scientists and 5 internal non-scientists. Given the new interaction model and the limited polish and visual feedback, I expected scientists to pause and ask questions. This was intentional, so that I could dig into how users thought.

Instead, all of the users used the table efficiently, only stopping with a "Wow!" after seeing what adding a column did.

I had underestimated how much expert users love their tables.

In the prototype, scientists immediately scrolled to orient around the columns, and when adding one, instinctively scrolled right expecting it to appear there. The column's dynamic, flexible nature caught them off guard, but as a delightful surprise.

Adding columns, though unfamiliar, fit naturally into their thinking and workflow.

Commercial partners resonated so strongly that they surfaced a new opportunity: template views that could accelerate demos and help drug discovery teams orient around consistent information. This would be similar to a shared spreadsheet, but more dynamic.

An oncology scientist prioritizing reproducibility across cell lines needs different dimensions surfaced than a neuroscientist focused on reagent specificity.

This led to a template button running queries through a simplified context bus, which also opened the door to latency improvements and pre-caching.

Using a template button runs the query through a simplified context bus, providing predictability to the table view.
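A minimal sketch of how a template could bypass the full pipeline: the template pre-fills the columns a discipline cares about, so intent inference is skipped and repeat runs can be served from cache. Template names, columns, and the cache shape are all illustrative assumptions, not BenchSci's actual templates.

```typescript
// A template fixes the columns for a discipline, replacing the
// intent-inference steps of the full pipeline.
type Template = { columns: string[] };

const templates: Record<string, Template> = {
  oncology: { columns: ["cellLine", "reproducibility"] }, // reproducibility-first view
  neuroscience: { columns: ["reagent", "specificity"] },  // specificity-first view
};

// Pre-caching: identical template + query pairs return the same table.
const cache = new Map<string, string[][]>();

function runTemplate(query: string, templateName: string): string[][] {
  const key = `${templateName}:${query}`;
  const hit = cache.get(key);
  if (hit) return hit; // cached: predictable view, lower latency
  const { columns } = templates[templateName];
  // Stub for the simplified context-bus run that would fetch real rows.
  const table = [columns, columns.map((c) => `${c} for ${query}`)];
  cache.set(key, table);
  return table;
}

const view = runTemplate("TLR4 IHC protocols", "oncology");
// view[0] is the header row fixed by the template
```

The predictability comes from the template fixing the header row; the latency win comes from the cache key being fully determined by template plus query.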

Step 3: Re-Converging with Charts

As I learned from the previous Methods MVP, evidence was essential to scientist trust.

Tables were an excellent surface for that, but dense. They were good for comparing attributes across rows, but poor at revealing patterns across the full result set: distributions, clusters, correlations, outliers.

Scientists were doing that pattern detection manually.

The agentic table helped scientists diverge while staying anchored in context, but it didn't help them converge. To address this, I added chart visualizations matched to that cognitive task, starting with 4 charts for different comparative purposes.

The added visual complexity also required closing the polish gap to keep the overall experience feeling cohesive.

Scientists could pivot the graph view into various chart views to visualize the data more meaningfully.

Filtering

Built on the same data as the table, charts could be filtered — and extended to support account-specific filters.

Filters simplified chart views. Filtered tables could be exported for use in other tools.
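One way to sketch the shared-data idea: table and charts read from one filtered dataset, so a filter applied anywhere updates both views. The row shape, score column, and binning scheme below are illustrative assumptions.

```typescript
// Table and charts share one filtered dataset; filtering once
// updates every view built on top of it.
type Row = { antibody: string; score: number };

function applyFilters(rows: Row[], minScore: number): Row[] {
  return rows.filter((r) => r.score >= minScore);
}

function toHistogram(rows: Row[], binSize: number): Map<number, number> {
  // Bucket scores so distribution patterns are visible at a glance,
  // replacing the manual scanning scientists were doing in the table.
  const bins = new Map<number, number>();
  for (const r of rows) {
    const bin = Math.floor(r.score / binSize) * binSize;
    bins.set(bin, (bins.get(bin) ?? 0) + 1);
  }
  return bins;
}

// The same filtered rows would feed the table render, the chart,
// and the export path.
const filtered = applyFilters(
  [{ antibody: "ab-1", score: 0.9 }, { antibody: "ab-2", score: 0.4 }],
  0.5,
);
const distribution = toHistogram(filtered, 0.5);
```

Keeping a single filtered dataset is what lets a filtered table be exported with confidence that it matches what the chart showed.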

Step 4: Table-aware chat

Our platform's primary interface was chat; this prototype's was a table. Chat's utility was clear, but how, or whether, it should interact with the table wasn't.

I wanted to answer that question, which meant prototyping a chat contextually aware of the table and the context bus. That led me to create a separate context bus for chat.

Agentic chat architecture

Table-aware chat, with table state carried as context at every step.

Pipeline steps: 01 Context Capture → 02 Chat Input → 03 Classify → 04 Route → 05 Execute → 06 Respond

Memory (context bus) keys: query, tableState, chatInput, operationType, filteredRows, newColumn, response

Outputs (produced in parallel at the same step): Updated Table + Chat Reply. Classification routes each message as filter vs. generate, and either path loops back into the table.

captureTableState(columns, rows, filters): snapshots the current table (active columns, visible rows, applied filters) into the bus before any chat operation begins. Reads: nothing. Writes: tableState.
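A minimal sketch of this capture-classify-route flow. The keyword matching stands in for the LLM classification step, and the filter predicate is a crude stand-in for semantic matching; all names here are illustrative.

```typescript
// Table-aware chat: snapshot the table, classify the message as a
// filter vs. generate operation, and route it back into the table.
type TableState = { columns: string[]; rows: Record<string, string>[] };

function captureTableState(columns: string[], rows: Record<string, string>[]): TableState {
  // Snapshot taken before any chat operation begins.
  return { columns, rows };
}

function classify(chatInput: string): "filter" | "generate" {
  // Keyword stand-in for the LLM classifier in the real pipeline.
  return /only|hide|show|remove/i.test(chatInput) ? "filter" : "generate";
}

function route(state: TableState, chatInput: string): TableState {
  if (classify(chatInput) === "filter") {
    // Filtering narrows visible rows while keeping the table as the anchor.
    return {
      ...state,
      rows: state.rows.filter((r) =>
        Object.values(r).some((v) => chatInput.toLowerCase().includes(v.toLowerCase())),
      ),
    };
  }
  // Generate: add a new column derived from the question (stubbed here).
  return { ...state, columns: [...state.columns, "newColumn"] };
}

const state = captureTableState(
  ["antibody", "species"],
  [{ antibody: "ab-1", species: "mouse" }, { antibody: "ab-2", species: "rabbit" }],
);
const narrowed = route(state, "show only mouse results");
```

Because both paths return an updated `TableState`, the chat reply and the table update can be produced in parallel from the same routed result.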

I added chat as a table action to keep focus anchored there, intentionally limiting scope but still showing how it could affect the table.

Table-aware chat that filters table on a suggested criteria

The original chat directed scientists to evidence. Table-aware chat extended that to help them work within it.

Validation

I tested the prototype with my team, scientists, and commercial stakeholders. The clearest signal came from the column-add mechanic where scientists immediately reached for dimensions specific to their work, despite knowing the table used synthetic data.

They were mentally modeling it as a real research tool, despite it being a prototype!

Commercial stakeholders noted the UX differentiation from competing chat products, and the combination of templates and an agentic table also addressed a service design gap in demos and onboarding, ultimately hinting at an opportunity for a more bespoke enterprise UI model altogether.

Result

What started as an exploration solidified into a coherent paradigm for evidence-first agentic interfaces. With the interaction model and differentiation case validated, it's now on the platform roadmap as an active exploration as the data model matures.

Reflection

Building in code was the most impactful decision I made. A Figma prototype simulates behavior. A coded prototype, however imperfect, actually lets you test the interaction model.

AI coding tools compressed that loop further, letting me follow my instincts quickly, though every iteration was ultimately shaped by watching what scientists tried to do.

If I did this again, I'd bring commercial stakeholders in earlier to better understand the service challenges alongside the user ones.

The bigger lesson was learning that designing AI products is only partly about the interface.

The dataflow, pipeline, schema, and agentic architecture are equally essential design decisions: how the AI should behave, what it should know, and what it should communicate about itself.