Challenges When Building AI Agents on Top of Your Structured Data
Building AI agents that work with structured data is a complex and multifaceted task. The challenges arise from the nature of the data, the limitations of AI models, and the multi-step processes involved in transforming raw data into actionable insights.
Structured Data in Different Formats and Storage Systems
Structured data often comes in diverse formats and is stored in various systems, which can make it challenging to access and utilize effectively. In many cases, the data can be vast and cumbersome, yet AI models only require specific subsets to address particular use cases. Extracting the right data can be tricky. Direct access to the entire database is typically not possible, especially when dealing with customers' sensitive information.
Further complicating matters is the possibility that the customer's data source may be offline, outdated, or in a legacy system that doesn’t conform to modern standards. In such scenarios, asking customers to fix these issues is neither practical nor feasible. Thus, AI agents need to handle data inconsistencies and limitations without requiring constant manual intervention from the customer.
Data Silos and Complex Joins
AI agents often need to access data from multiple sources. Manually joining data from various systems is intricate and error-prone, especially for complex AI tasks.
Maintaining Data Freshness
Depending on the use case, the data needs to be refreshed regularly. Some real-time applications need the data updated in near real time. When a data source is not always online, the system still needs to be able to handle the query.
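One way to keep answering queries while a source is unreachable is to fall back to the last successful snapshot. The sketch below is only an illustration under assumptions: `SnapshotCache`, `flaky_source`, and the choice of `ConnectionError` as the failure signal are hypothetical names picked for this example, not part of any particular system.

```python
from typing import Callable, Optional


class SnapshotCache:
    """Serve live rows when possible, otherwise the last good snapshot (hypothetical helper)."""

    def __init__(self, fetch: Callable[[], list[tuple]]):
        self._fetch = fetch
        self._snapshot: Optional[list[tuple]] = None

    def read(self) -> tuple[list[tuple], bool]:
        """Return (rows, is_live); is_live is False when the snapshot was used."""
        try:
            self._snapshot = self._fetch()
            return self._snapshot, True
        except ConnectionError:
            if self._snapshot is None:
                raise  # nothing cached yet, so the failure must surface
            return self._snapshot, False


# A stand-in source that works once and then goes offline.
calls = {"n": 0}


def flaky_source() -> list[tuple]:
    calls["n"] += 1
    if calls["n"] > 1:
        raise ConnectionError("source offline")
    return [("Action", 14)]


cache = SnapshotCache(flaky_source)
first = cache.read()   # live read
second = cache.read()  # source is offline, snapshot is served
print(first, second)
```

Returning the `is_live` flag alongside the rows lets the agent tell the user that an answer may be based on older data.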
LLM: Prompt Tuning, Workflow Orchestration, and Observability
Large Language Models (LLMs) can be powerful tools when building AI agents on top of structured data, but they require careful tuning and optimization to work effectively. The key to success lies in providing the LLM with the exact right data, which can be challenging given the diversity and complexity of structured datasets.
Prompt Tuning: One of the first challenges when working with LLMs is crafting the right prompts. LLMs rely heavily on the quality and specificity of the prompts they receive. A vague or overly broad prompt can lead to inaccurate or irrelevant responses. Therefore, prompt tuning is critical—developers must iterate on how data is presented to the model, adjusting the phrasing and structure to ensure the LLM interprets the data correctly. This process often involves refining prompts based on feedback and results, a continuous effort to align the model's outputs with the desired outcomes.
Workflow Orchestration: In a real-world application, LLMs don’t operate in isolation—they are part of a broader workflow that involves multiple steps and tools. For instance, one step may involve data preprocessing, while another might involve data analysis or visualization. Orchestrating these steps effectively is crucial to ensure that the model receives the right data at each stage and that the output of one step seamlessly feeds into the next. This orchestration helps maintain data integrity and ensures that the model’s predictions or insights are based on well-processed information.
Observability: To truly succeed with LLMs, developers need visibility into how the model processes data and makes decisions. This is where observability comes in. It’s not enough to rely solely on the model's output; understanding the reasoning behind the model's decisions, its performance at each stage, and how it interacts with the data is essential. Observability tools help track the flow of data through the model, highlight where potential errors might occur, and allow developers to fine-tune the system for better results. The more transparent and observable the model’s operations are, the easier it is to make informed adjustments and optimize the workflow.
By ensuring the LLM is provided with the right data at the right time, and by orchestrating the workflow effectively while maintaining high levels of observability, developers can maximize the success of AI agents built on top of structured data.
Multi-Step Agents and Complex Workflows
Another significant challenge lies in building multi-step agents that require several sequential actions to extract insights. For example, an agent might need to first analyze data in a table, identify relevant information, and then use that information to create a chart. Each of these steps involves different algorithms and interactions with the data, making the overall process more complicated. Orchestrating these steps in a way that ensures the agent performs efficiently and accurately is a considerable hurdle.
Define a Semantic Layer to Bridge the Gap
A semantic layer serves as a translation layer between the AI agent and the data sources. Those translations can be as simple as selecting the right columns from a table, or as involved as joining data from multiple sources and applying transformations and aggregations. Sometimes those translations require deep domain knowledge from human experts.
Take the well-known Sakila database as an example. Sakila is a sample database used for testing MySQL. It has a set of tables that represent a DVD rental store, with information about customers, staff, films, and payments. You could have semantic tables like:
- rental_revenue_by_specific_city, which calculates the revenue generated from rentals in a specific city.
- customer_lifetime_value, which calculates the lifetime value of a customer based on their rental history.
- top_customers_by_genre, which identifies the top customers who have rented the most films in a specific genre.
- etc.
Each of these semantic tables has a well-defined schema and metadata that the AI agent can use to understand the data and generate insights.
Each semantic table can be accessed only by agents that have the right permissions. For example, an agent that calculates customer lifetime value should not have access to the rental revenue data.
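As a rough illustration, a semantic table such as rental_revenue_by_specific_city could be expressed as a SQL view over the raw tables. The sketch below uses an in-memory SQLite database and a heavily simplified schema. This is an assumption for brevity: the real Sakila schema links customers to cities through an address table, and the rows here are toy data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Simplified stand-ins for the Sakila tables (assumed schema, toy rows).
    CREATE TABLE city (city_id INTEGER PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, city_id INTEGER NOT NULL);
    CREATE TABLE payment (
        payment_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        amount REAL NOT NULL
    );

    INSERT INTO city VALUES (1, 'Lethbridge'), (2, 'Woodridge');
    INSERT INTO customer VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO payment VALUES (100, 10, 2.99), (101, 11, 4.99), (102, 12, 0.99);

    -- The semantic table: the joins and aggregation are hidden from the agent.
    CREATE VIEW rental_revenue_by_city AS
    SELECT ci.city AS city, ROUND(SUM(p.amount), 2) AS revenue
    FROM payment p
    JOIN customer cu ON cu.customer_id = p.customer_id
    JOIN city ci ON ci.city_id = cu.city_id
    GROUP BY ci.city;
    """
)


def rental_revenue_for_city(city: str) -> float:
    """Look up revenue for one city; returns 0.0 when the city has no payments."""
    row = conn.execute(
        "SELECT revenue FROM rental_revenue_by_city WHERE city = ?", (city,)
    ).fetchone()
    return row[0] if row else 0.0


print(rental_revenue_for_city("Lethbridge"))
```

The agent only needs to know the view's two columns, not how customers, cities, and payments relate to each other.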
Table Metadata
The following information can be used to help the AI agent understand the data:
- user provided descriptions
- column names and types
- possible values for categorical columns
- primary keys
- whether a column is nullable
- demographics of the data if available
- etc.
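The metadata listed above could be captured in a small data structure and rendered as text for the agent's prompt. This is a minimal sketch under assumptions: `ColumnMetadata`, `TableMetadata`, and `to_prompt` are hypothetical names, and the output layout is just one plausible choice.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ColumnMetadata:
    name: str
    dtype: str
    nullable: bool = False
    primary_key: bool = False
    description: str = ""
    allowed_values: Optional[list[str]] = None  # for categorical columns


@dataclass
class TableMetadata:
    name: str
    description: str  # user-provided description
    columns: list[ColumnMetadata] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render the metadata as plain text that can be placed in an LLM prompt."""
        lines = [f"Table: {self.name}", f"Description: {self.description}", "Columns:"]
        for col in self.columns:
            flags = ["primary key"] if col.primary_key else []
            flags.append("nullable" if col.nullable else "not null")
            line = f"- {col.name} ({col.dtype}, {', '.join(flags)})"
            if col.description:
                line += f": {col.description}"
            if col.allowed_values:
                line += f" [values: {', '.join(col.allowed_values)}]"
            lines.append(line)
        return "\n".join(lines)


table = TableMetadata(
    name="top_customers_by_genre",
    description="Customers who rented the most films in each genre.",
    columns=[
        ColumnMetadata("customer_id", "INTEGER", primary_key=True),
        ColumnMetadata("genre", "TEXT", allowed_values=["Action", "Comedy"]),
        ColumnMetadata("rentals", "INTEGER", description="number of rentals"),
    ],
)
prompt = table.to_prompt()
print(prompt)
```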
Data Freshness
The following freshness options can be provided:
- I don't care about freshness, once data is loaded it can be used forever even if the source data changes.
- I want the data to be refreshed every hour/day/week/month.
- I want the data to be refreshed in real-time.
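These options can be modeled as a maximum allowed age for loaded data. The following is a sketch, not a prescribed design: `FreshnessPolicy` and the constant names are hypothetical, and treating "real-time" as a maximum age of zero is a simplifying assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class FreshnessPolicy:
    # None means the data never needs a refresh once it is loaded.
    max_age: Optional[timedelta]

    def is_stale(self, loaded_at: datetime, now: Optional[datetime] = None) -> bool:
        """Return True when data loaded at `loaded_at` should be refreshed."""
        if self.max_age is None:
            return False
        now = now or datetime.now(timezone.utc)
        return now - loaded_at >= self.max_age


LOAD_ONCE = FreshnessPolicy(max_age=None)
HOURLY = FreshnessPolicy(max_age=timedelta(hours=1))
DAILY = FreshnessPolicy(max_age=timedelta(days=1))
REAL_TIME = FreshnessPolicy(max_age=timedelta(0))  # every read counts as stale

loaded_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 13, 30, tzinfo=timezone.utc)  # 90 minutes later

for label, policy in [("once", LOAD_ONCE), ("hourly", HOURLY),
                      ("daily", DAILY), ("real-time", REAL_TIME)]:
    print(label, policy.is_stale(loaded_at, now))
```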
Data Governance
- Users can specify which AI agents can access a specific semantic table.
- Users can see which AI agents have accessed a specific semantic table, and when.
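Both governance points can be sketched with a grant list plus an audit log. The class and method names below (`Governance`, `grant`, `check`, `accesses`) and the agent names are hypothetical, chosen only for this illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessEvent:
    agent: str
    table: str
    at: datetime
    allowed: bool


@dataclass
class Governance:
    grants: dict[str, set[str]] = field(default_factory=dict)  # table -> agents
    audit_log: list[AccessEvent] = field(default_factory=list)

    def grant(self, table: str, agent: str) -> None:
        self.grants.setdefault(table, set()).add(agent)

    def check(self, agent: str, table: str) -> bool:
        """Decide whether the agent may read the table and record the attempt."""
        allowed = agent in self.grants.get(table, set())
        self.audit_log.append(
            AccessEvent(agent, table, datetime.now(timezone.utc), allowed)
        )
        return allowed

    def accesses(self, table: str) -> list[AccessEvent]:
        """Return every recorded attempt (allowed or denied) for one table."""
        return [event for event in self.audit_log if event.table == table]


gov = Governance()
gov.grant("customer_lifetime_value", "clv_agent")

ok = gov.check("clv_agent", "customer_lifetime_value")
denied = gov.check("clv_agent", "rental_revenue_by_specific_city")
print(ok, denied)
```

Logging denied attempts as well as allowed ones makes the audit trail useful for spotting misconfigured agents.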
Composability
- Semantic tables can be combined to create a virtual database, and queries can be run against this virtual database.
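One possible way to realize such a virtual database is to load the materialized semantic tables into an in-memory SQLite database and query across them. This is a sketch under assumptions: `build_virtual_db` is a hypothetical helper, the rows are toy data, and table and column names are trusted (they are interpolated into SQL, so they must not come from untrusted input).

```python
import sqlite3


def build_virtual_db(
    tables: dict[str, tuple[list[str], list[tuple]]]
) -> sqlite3.Connection:
    """Create an in-memory database with one table per semantic table."""
    conn = sqlite3.connect(":memory:")
    for name, (columns, rows) in tables.items():
        col_list = ", ".join(f'"{c}"' for c in columns)
        placeholders = ", ".join("?" for _ in columns)
        conn.execute(f'CREATE TABLE "{name}" ({col_list})')
        conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', rows)
    return conn


virtual_db = build_virtual_db(
    {
        "customer_lifetime_value": (
            ["customer_id", "lifetime_value"],
            [(1, 120.5), (2, 80.0)],
        ),
        "top_customers_by_genre": (
            ["genre", "customer_id", "rentals"],
            [("Action", 1, 14), ("Comedy", 2, 9)],
        ),
    }
)

# A query that spans two semantic tables.
rows = virtual_db.execute(
    """
    SELECT t.genre, c.lifetime_value
    FROM top_customers_by_genre t
    JOIN customer_lifetime_value c USING (customer_id)
    ORDER BY t.genre
    """
).fetchall()
print(rows)
```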
Build Multi-Step Data Agents with Semantic Tables
Once the semantic tables are defined, building multi-step data agents involves the following considerations:
LLM-Driven Workflow Orchestration
In our system, a workflow is a series of actions executed in sequence. Each action can query one or more semantic tables, call an external API, call an LLM to generate insights, or run a piece of code. A workflow serves as a single step in the multi-step agent.
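A workflow of this kind can be sketched as a list of functions that each read a shared context and return new values to merge into it. Everything here is illustrative: `run_workflow` and the three actions are hypothetical, the rows are toy data, and `summarize` is a plain function standing in for an LLM call.

```python
from typing import Any, Callable

# An action reads the current context and returns the keys it wants to add.
Action = Callable[[dict[str, Any]], dict[str, Any]]


def run_workflow(actions: list[Action], context: dict[str, Any]) -> dict[str, Any]:
    """Run actions in order, merging each action's output into the context."""
    for action in actions:
        context = {**context, **action(context)}
    return context


def query_revenue(ctx: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for a query against a semantic table.
    return {"rows": [("Lethbridge", 7.98), ("Woodridge", 0.99)]}


def pick_top(ctx: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for a piece of code run as an action.
    city, revenue = max(ctx["rows"], key=lambda row: row[1])
    return {"top_city": city, "top_revenue": revenue}


def summarize(ctx: dict[str, Any]) -> dict[str, Any]:
    # Stand-in for an LLM call that turns numbers into an insight.
    return {"summary": f"{ctx['top_city']} leads with {ctx['top_revenue']:.2f} in revenue."}


result = run_workflow([query_revenue, pick_top, summarize], {})
print(result["summary"])
```

Because each action only sees and returns dictionaries, actions can be reordered or swapped without changing the runner.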
A Plan-and-Execute Agent
A plan-and-execute agent takes a user query, creates a plan, calls individual agents or workflows step by step, and returns the result to the user.
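The control loop of such an agent can be sketched in a few lines. This is a simplified illustration under assumptions: `fixed_planner` returns a hard-coded plan where a real system would ask an LLM, and the `analyze` and `chart` tools are hypothetical stubs.

```python
from typing import Callable

# A tool receives the step's instruction and the previous step's result.
Tool = Callable[[str, str], str]
Planner = Callable[[str], list[tuple[str, str]]]


def fixed_planner(query: str) -> list[tuple[str, str]]:
    """Stand-in for an LLM planner: returns (tool_name, instruction) pairs."""
    return [("analyze", query), ("chart", "plot the analysis result")]


def plan_and_execute(query: str, tools: dict[str, Tool], planner: Planner) -> str:
    """Plan once, then run each step in order, passing results forward."""
    previous = ""
    for tool_name, instruction in planner(query):
        if tool_name not in tools:
            raise KeyError(f"plan references unknown tool: {tool_name}")
        previous = tools[tool_name](instruction, previous)
    return previous


def analyze(instruction: str, previous: str) -> str:
    # Stand-in for a workflow that queries a semantic table.
    return "Action: 14 rentals"


def chart(instruction: str, previous: str) -> str:
    # Stand-in for a charting workflow; it only describes the chart.
    return f"bar chart of [{previous}]"


answer = plan_and_execute(
    "Which genre is rented most?",
    {"analyze": analyze, "chart": chart},
    fixed_planner,
)
print(answer)
```

Failing fast on an unknown tool name keeps a bad plan from silently skipping a step.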