The Need for Real-Time Autonomous Agents
Data is only as valuable as your ability to act on it. In the modern enterprise, reacting to events hours—or even minutes—after they occur is often too late. Whether you're dealing with financial fraud or dynamic supply chain disruptions, every second counts.
Yet many systems still rely on scheduled batch jobs or fragile microservices that constantly poll for changes. By the time a problem surfaces, human investigators are left scrambling to piece things together through logs and ad-hoc database queries. It's a slow, painful process that doesn't scale.
Enter Event-Driven Data Agents
What if, instead of waiting for slow pipelines and manual triage, your data platform could instantly push an alert as soon as an anomaly is detected, triggering an autonomous AI agent to investigate and resolve it?
This is the promise of the Event-Driven Data Agent architecture. By combining BigQuery continuous queries, Pub/Sub, and Agent Development Kit (ADK) agents on Vertex AI Agent Engine, you can build a pipeline that triages events in real time and autonomously investigates them. The agent uses advanced reasoning to gather context, analyze the data, and either resolve the issue on the spot or escalate it to a person when human-in-the-loop intervention is needed.
The Hybrid Architecture: How it Works
This event-driven pipeline leverages three core building blocks:
- Detection: BigQuery continuous queries monitor live data streams and detect anomalies using a rules-based engine.
- Routing: Pub/Sub reliably delivers these events, using Single Message Transforms (SMTs) to reshape the payloads into the exact format your AI agents expect, thereby triggering the agentic pipeline to start its investigation.
- Resolution: A Vertex AI agent (built with ADK) receives the event, investigates using custom tools, and logs its decision.
Let’s dive in and explore each component. To make this concrete, we'll walk through a simple use case: detecting and investigating fraudulent financial transactions in real-time.
Part 1: BigQuery Continuous Queries
BigQuery continuous queries allow you to build real-time event streams natively using standard SQL. They are persistent SQL queries that run continuously, analyzing incoming data and immediately exporting SQL results to destinations like Pub/Sub.
The shift from pulling to pushing streaming events natively in BigQuery means you can detect complex anomalies (like a user transacting in two different countries within a user-specified window) inside your data warehouse using standard SQL. There's no need to move your data to a separate streaming analytics engine.
This transformation is powered by the launch of BigQuery continuous query stateful data processing in public preview, which introduces native support for stream-to-stream JOINs, windowed aggregations, and tumbling windows. By allowing you to correlate disparate data streams and calculate complex metrics—such as rolling averages or sum totals—directly in BigQuery, we are democratizing stream processing for any SQL user. This eliminates the need for specialized external tools or deep data science expertise to build a real-time 'System of Action' that detects and reacts to events as they happen. This approach also helps manage LLM token costs; by using stateful SQL to filter for specific anomalies, you ensure that your agents only process the exact context they need, rather than overwhelming them with raw data.
Implementing this is straightforward. By combining a standard SQL query with an EXPORT DATA statement, you can route matching rows directly into a Pub/Sub topic the second they occur:
```sql
EXPORT DATA
  OPTIONS (
    format = 'CLOUD_PUBSUB',
    uri = 'https://pubsub.googleapis.com/projects/PROJECT_ID/topics/TOPIC_ID'
  ) AS (
  SELECT
    TO_JSON_STRING(STRUCT(
      window_start,
      window_end,
      user_id,
      STRUCT(
        APPROX_COUNT_DISTINCT(location_country) > 1 AS is_impossible_travel,
        LOGICAL_OR(NOT is_trusted_device) AS has_security_mismatch
      ) AS logic_signals
    )) AS data
  FROM TUMBLE(TABLE TransactionHeuristics, "bq_changed_ts", INTERVAL 2 MINUTE)
  GROUP BY window_start, window_end, user_id
  HAVING APPROX_COUNT_DISTINCT(location_country) > 1
);
```

Part 2: Pub/Sub & Single Message Transforms (SMT)
Bridging the schema gap with Pub/Sub. The exported event data from our continuous query is sent directly to a Pub/Sub topic. Before this raw data can be consumed by our AI agent, the payload needs to be transformed to match the schema expected by our agent.
Instead of deploying something like a dedicated Cloud Function to reformat these messages, you can handle it entirely within the Pub/Sub subscription using a Single Message Transform (SMT). SMTs allow you to run lightweight, inline JavaScript User-Defined Functions (UDFs) directly within Pub/Sub to map, reshape, or clean the payload on the fly.
For instance, you can define a transform.yaml with a JavaScript snippet that intercepts the BigQuery payload and extracts the exact query format our agent expects.
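As a minimal sketch, the UDF below parses the continuous query's JSON payload and rewrites it into a single prompt field. The input field names (`user_id`, `window_start`, `window_end`, `logic_signals`) and the `{"message": "..."}` output shape are assumptions based on the pipeline described here; adjust them to your actual schemas.

```javascript
// Sketch of a Pub/Sub SMT JavaScript UDF (hypothetical field names).
// In a Pub/Sub SMT, message.data is the payload as a string.
function transform(message, metadata) {
  const event = JSON.parse(message.data);

  // Flatten the structured BigQuery event into the single
  // natural-language prompt field the agent endpoint expects (assumed).
  const prompt =
    "Investigate potential fraud for user " + event.user_id +
    " between " + event.window_start + " and " + event.window_end +
    ". Signals: " + JSON.stringify(event.logic_signals);

  message.data = JSON.stringify({ message: prompt });
  return message;
}
```

In your transform.yaml, this function would be registered as the subscription's JavaScript UDF so it runs inline on every message, with no extra infrastructure to deploy.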
To configure the routing pipeline, you create a Pub/Sub push subscription. This subscription automatically pushes every transformed BigQuery event directly to your AI agent's webhook endpoint.
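A sketch of that subscription with gcloud is shown below. The topic and subscription names, service account, and endpoint URL are all placeholders; the streamQuery URL pattern in particular should be taken from your deployed Agent Engine resource rather than copied from here.

```shell
# Hypothetical names/IDs throughout -- substitute your own.
gcloud pubsub subscriptions create fraud-agent-sub \
  --topic=fraud-events \
  --message-transforms-file=transform.yaml \
  --push-endpoint="https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/reasoningEngines/ENGINE_ID:streamQuery" \
  --push-auth-service-account="agent-invoker@PROJECT_ID.iam.gserviceaccount.com"
```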
The subscription's push-endpoint parameter is the critical link: this webhook URL is generated by our final architectural piece, the AI agent itself.
Part 3: ADK and Vertex AI Agent Engine
This is the brain of the operation. Once an anomaly is detected and routed via Pub/Sub, the message triggers an ADK agent deployed on Vertex AI. When an agent is deployed to Vertex AI Agent Engine, the platform automatically provisions a secure streamQuery endpoint specifically designed to receive these incoming events.
To implement the reasoning loop, you define your agent, equip it with tools, and deploy it.
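A minimal sketch, assuming the ADK Python SDK (google-adk) and the Vertex AI SDK: the agent name, model choice, instruction text, and the tool body are all illustrative, and the tool here is a stub standing in for a real parameterized BigQuery lookup.

```python
from google.adk.agents import Agent

def get_user_transactions(user_id: str) -> dict:
    """Tool stub: fetch the user's recent transaction history.
    In practice this would run a parameterized BigQuery query."""
    return {"user_id": user_id, "transactions": []}  # placeholder

root_agent = Agent(
    name="fraud_triage_agent",
    model="gemini-2.5-flash",  # assumed model choice
    instruction=(
        "You are a fraud investigator. For each alert, gather context "
        "with your tools, then classify the event as FALSE_POSITIVE "
        "or ESCALATION_NEEDED and explain your reasoning."
    ),
    tools=[get_user_transactions],
)

# Deploy to Vertex AI Agent Engine, which provisions the
# streamQuery endpoint that Pub/Sub will push to.
from vertexai import agent_engines
from vertexai.preview import reasoning_engines

app = reasoning_engines.AdkApp(agent=root_agent)
remote_app = agent_engines.create(
    agent_engine=app,
    requirements=["google-cloud-aiplatform[adk,agent_engines]"],
)
```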
Equipped with specific instructions and this custom toolset, the agent autonomously investigates the alert by actively gathering external context. It can query BigQuery for a user's transaction history, analyze unstructured data like receipts, or ground its findings with Google Search to verify a merchant's reputation. Ultimately, it categorizes the transaction as a FALSE_POSITIVE or flags it as ESCALATION_NEEDED.
The Human-in-the-Loop Advantage
This approach is central to the architecture's scalability. By effectively filtering out the noise, it dramatically reduces operational overhead and ensures that your investigators only spend their time on the most complex cases. And since ADK offers an impressive array of tools and integrations, your agent can escalate events to a wide range of enterprise systems for human-in-the-loop engagement, or even automate pipelines end-to-end with human-on-the-loop observability.
Bringing it All Together: Agent Analytics
Once your pipeline is live, the work shifts from building to monitoring. Unlike traditional software, autonomous agents run persistently in the background. Because they operate behind the scenes, having deep observability into what they are doing, how long they take, and how much they cost is critical.
By initializing the BigQuery Agent Analytics plugin during deployment, the ADK automatically logs all trace data, tool usage, and execution latency directly into BigQuery.
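A rough sketch of that wiring is below. Treat the import path, class name, and constructor arguments as assumptions to be checked against the plugin's documentation for your ADK version; the agent, project, and dataset names are placeholders.

```python
from google.adk.agents import Agent
from google.adk.apps import App
from google.adk.plugins.bigquery_agent_analytics_plugin import (
    BigQueryAgentAnalyticsPlugin,
)

root_agent = Agent(name="fraud_triage_agent", model="gemini-2.5-flash")

# Register the plugin on the app so every invocation's traces,
# tool calls, and latencies land in a BigQuery dataset.
app = App(
    name="fraud_triage_app",
    root_agent=root_agent,
    plugins=[
        BigQueryAgentAnalyticsPlugin(
            project_id="PROJECT_ID",       # assumed destination project
            dataset_id="agent_analytics",  # assumed destination dataset
        )
    ],
)
```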
By joining this trace data with the structured decisions output by your agent, you unlock rich analytics. This enables you to build dynamic dashboards and set up custom alerts to monitor your AI workforce in real time. You can check out this codelab to learn more about using the Agent Analytics plugin.
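For example, a join like the following could surface investigation volume and latency per decision type. The table and column names here are hypothetical; substitute the plugin's actual destination table and wherever your agent logs its verdicts.

```sql
-- Hypothetical table and column names -- adjust to your schema.
SELECT
  d.decision,                        -- FALSE_POSITIVE or ESCALATION_NEEDED
  COUNT(*) AS investigations,
  AVG(t.latency_ms) AS avg_latency_ms
FROM `PROJECT_ID.agent_analytics.events` AS t
JOIN `PROJECT_ID.fraud.agent_decisions` AS d
  ON t.invocation_id = d.invocation_id
GROUP BY d.decision;
```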
Conclusion
The convergence of real-time data streaming and Agentic AI is changing how we handle operational alerts.
- Detect in real time with BigQuery continuous queries.
- Transform and route with Pub/Sub SMTs.
- Investigate and resolve with Vertex AI Agent Engine.
- Analyze with the BigQuery Agent Analytics plugin.
This architecture enables you to build a proactive, autonomous workforce capable of handling anomalies the moment they occur—all within a governed, scalable, and serverless Google Cloud environment.
Ready to get hands on?
Check out our codelab for a step-by-step guide on how to build this Cymbal Bank pipeline from scratch!