Harnessing Unstructured Data with BigQuery Graph and Kineviz GraphXR

Over 80% of enterprise data is unstructured: PDFs, emails, and reports that often hold vital business information but are hard to query. Combining BigQuery Graph with Kineviz GraphXR gives decision-makers a single workflow for turning those documents into insights they can explore.

Retrieval-augmented generation (RAG) and vector search are now standard practice for handling unstructured data, and this integration builds on them. Complementing RAG with graph technology lets organizations improve trend analysis, entity comparison, and multi-hop reasoning while keeping every insight verifiable and traceable.
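Multi-hop reasoning amounts to following several relationships in sequence, for example from a company to its competitors and then to their competitors. The sketch below illustrates the idea with a plain breadth-first search in Python; it is not how BigQuery Graph executes queries, and the function name and edge format are assumptions made for illustration.

```python
from collections import deque


def within_hops(edges, start, max_hops):
    """Return {entity: hop_count} for entities reachable from `start`
    in at most `max_hops` directed steps (hypothetical helper)."""
    # Build an adjacency map from (source, destination) pairs.
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)

    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in hops:  # first visit is the shortest path in BFS
                hops[nxt] = hops[node] + 1
                queue.append(nxt)

    del hops[start]  # the start entity is not its own result
    return hops


# Placeholder entities: A names B as a competitor, B names C, C names D.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
two_hop = within_hops(edges, "A", 2)
```

A vector search alone would typically surface passages similar to a query; a traversal like this one answers a different kind of question, namely which entities are connected and at what distance.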

Streamlining Data Management with BigQuery

Traditional analytics pipelines for unstructured data can be cumbersome, requiring multiple systems for storage, parsing, extraction, and analysis. BigQuery simplifies this by consolidating these processes into a single platform. Raw documents are stored in Google Cloud Storage, with text extraction and graph creation handled directly within BigQuery, eliminating data movement and potential synchronization issues.

Transforming SEC Filings into Knowledge Graphs

A practical application of this pipeline involved analyzing SEC 10-K filings from Fortune 500 companies between 2020 and 2024. Each filing, typically around 100 pages, was processed through a four-step approach:

  1. Ingest and parse: Retrieve filings from SEC EDGAR and load them into BigQuery.
  2. Focus on key sections: Extract relevant sections related to market activities, risks, and competitors.
  3. AI extraction: Use Gemini 3 Pro to process sections into structured JSON, detailing competitors and risks.
  4. Graph declaration: Map the structured data into a traversable graph using a single Data Definition Language (DDL) statement.
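
Steps 3 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline's actual code: the JSON shape, the table and column names, and both helper functions are hypothetical, and the DDL text is modeled on SQL/PGQ-style CREATE PROPERTY GRAPH statements, so the exact grammar should be checked against the BigQuery Graph documentation.

```python
import json

# Hypothetical shape of one extraction result from step 3.
# The source does not publish the real schema.
SAMPLE_EXTRACTION = json.dumps({
    "company": "ExampleCo",
    "fiscal_year": 2023,
    "competitors": [{"name": "Rival Inc"}, {"name": "Other Corp"}],
    "risks": [{"category": "supply chain", "summary": "Single-source supplier."}],
})


def to_rows(raw_json: str) -> dict[str, list[dict]]:
    """Flatten one extraction result into node rows and edge rows."""
    doc = json.loads(raw_json)
    company, year = doc["company"], doc["fiscal_year"]
    rows = {"Company": [{"name": company}], "Risk": [],
            "COMPETES_WITH": [], "DISCLOSES": []}
    for comp in doc.get("competitors", []):
        rows["Company"].append({"name": comp["name"]})
        rows["COMPETES_WITH"].append(
            {"src": company, "dst": comp["name"], "fiscal_year": year})
    for i, risk in enumerate(doc.get("risks", [])):
        risk_id = f"{company}:{year}:{i}"  # synthetic key, assumed for the sketch
        rows["Risk"].append({"id": risk_id, **risk})
        rows["DISCLOSES"].append(
            {"src": company, "dst": risk_id, "fiscal_year": year})
    return rows


def property_graph_ddl(dataset: str, graph: str) -> str:
    """Build one DDL statement declaring the graph over existing tables.

    Assumed syntax; table and column names are placeholders.
    """
    return f"""CREATE OR REPLACE PROPERTY GRAPH {dataset}.{graph}
  NODE TABLES (
    {dataset}.company KEY (name) LABEL Company,
    {dataset}.risk KEY (id) LABEL Risk
  )
  EDGE TABLES (
    {dataset}.competes_with
      KEY (src, dst, fiscal_year)
      SOURCE KEY (src) REFERENCES company (name)
      DESTINATION KEY (dst) REFERENCES company (name)
      LABEL COMPETES_WITH,
    {dataset}.discloses
      KEY (src, dst)
      SOURCE KEY (src) REFERENCES company (name)
      DESTINATION KEY (dst) REFERENCES risk (id)
      LABEL DISCLOSES
  )"""


rows = to_rows(SAMPLE_EXTRACTION)
ddl = property_graph_ddl("demo_dataset", "filings_graph")
```

The point of the sketch is the division of labor: the extraction output stays in ordinary tables, and the graph is a declaration layered on top of them rather than a separate copy of the data.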

This process yielded 87,000 entities and over 20,000 competitor mentions, which were consolidated into approximately 8,100 distinct competitors, effectively creating a knowledge graph from unstructured filings.
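
The source does not describe how the competitor mentions were consolidated. One simple way to approach that kind of step is to normalize each name and group mentions that share a normalized form, as in the Python sketch below. The suffix list and function names are assumptions for illustration, and a production pipeline would likely need fuzzier matching than this.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore when comparing names.
_SUFFIXES = {"inc", "incorporated", "corp", "corporation",
             "co", "company", "llc", "ltd", "plc"}


def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]+", " ", name.lower()).split()
    while tokens and tokens[-1] in _SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def consolidate(mentions):
    """Group raw mentions by normalized name: {normalized: [raw, ...]}."""
    groups = defaultdict(list)
    for mention in mentions:
        groups[normalize(mention)].append(mention)
    return dict(groups)


mentions = ["Microsoft Corporation", "Microsoft Corp.", "MICROSOFT", "Alphabet Inc."]
groups = consolidate(mentions)
# Four mentions collapse into two distinct competitors.
```

Exact-match grouping of this kind will not merge abbreviations, former names, or subsidiaries with their parents, so it should be read as a lower bound on what entity resolution involves.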

Interactive Analysis with Kineviz GraphXR

Kineviz GraphXR connects directly to BigQuery Graph so analysts can explore the data interactively. Users navigate relationships visually and build low-code workflows without writing complex queries, which lets teams refine an analysis themselves.

GraphXR’s AI-assisted features accept natural-language queries, such as a request to track Apple’s competitive trajectory over time, and generate dynamic dashboards that reflect real-time changes in the graph.

Ensuring Traceability and Auditability

Every node within the graph is linked back to its source in the original SEC filings, allowing analysts to validate insights in context. For example, selecting a risk entity provides a direct link to the relevant section of the original document.
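
One way to picture this traceability is a node that carries a pointer to the exact span of the filing it came from. The Python sketch below assumes a hypothetical layout (a filing identifier, a section label, and character offsets); the source does not specify how the links are actually stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceRef:
    """Hypothetical provenance record for one extracted fact."""
    filing_id: str  # placeholder identifier for the source filing
    section: str    # e.g. the filing section the text came from
    start: int      # character offset where the supporting text begins
    end: int        # character offset where it ends (exclusive)


@dataclass(frozen=True)
class RiskNode:
    node_id: str
    summary: str
    source: SourceRef


def cite(node: RiskNode, filing_text_by_id: dict[str, str]) -> str:
    """Return the original passage that supports the node."""
    text = filing_text_by_id[node.source.filing_id]
    return text[node.source.start:node.source.end]


# Placeholder filing text and node, invented for the example.
filing_text = "Item 1A. Risk Factors. We rely on a single supplier."
start = filing_text.index("We rely")
node = RiskNode(
    node_id="risk-1",
    summary="Supplier concentration",
    source=SourceRef("filing-001", "Item 1A", start, len(filing_text)),
)
passage = cite(node, {"filing-001": filing_text})
```

Storing offsets rather than a copy of the passage keeps the graph small and guarantees that what the analyst sees is the filing's own wording.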

Key Benefits of the Integrated Solution

  • Simplicity: Fewer systems and data copies streamline the workflow.
  • Scalability: BigQuery can manage vast amounts of documents and facts without custom infrastructure.
  • Explainability: Insights can be traced back to their source with ease.
  • Flexibility: The schema can be extended to accommodate new questions or entity types.

This integrated approach transforms the way organizations interact with unstructured data, making it easier to unlock valuable insights and drive informed decision-making.

This editorial summary reflects Google and other public reporting on Harnessing Unstructured Data with BigQuery Graph and Kineviz GraphXR.

Reviewed by WTGuru editorial team.