The ways we interact with our data systems are changing almost every day.
Think about how we worked with data two years ago…
Who would have thought of AI assistants helping us formulate data queries, code complex data engineering tasks, or even automagically build Power BI reports based on our prompts?
It looks like we have to deal with the ever-changing nature of AI taking up more and more space in our daily work.
Today, I wanted to give the new Fabric Data Agents a try. According to the documentation, a Fabric Data Agent is defined as follows:
Data agent in Microsoft Fabric is a new Microsoft Fabric feature that allows you to build your own conversational Q&A systems using generative AI. A Fabric data agent makes data insights more accessible and actionable for everyone in your organization. With a Fabric data agent, your team can have conversations, with plain English-language questions, about the data that your organization stored in Fabric OneLake and then receive relevant answers. This way, even people without technical expertise in AI or a deep understanding of the data structure can receive precise and context-rich answers.
Let’s give it a try and build our first Data Agent.
What are the steps needed to create my first Data Agent?
Prerequisites
Well, it’s quite easy. We need a paid Fabric capacity (F2 or larger) and some tenant settings turned on (see the prerequisites: https://learn.microsoft.com/en-us/fabric/data-science/how-to-create-data-agent#prerequisites).
Data for your Data Agent
Second, we need a data source. In my case, I did not want to create a sample dataset on my own, so I created a new Warehouse in Fabric using the Sample warehouse option (this creates a new Warehouse plus several tables containing NY City taxi data).


At the end of the wizard, these tables have been created:

I also edited the Model Layout to create some relationships between the tables. In the end, this is the data model for my first Data Agent:

Let’s create the Data Agent
Back in the workspace, select the Data agent (preview) option in the New Item dialog. It’s still in preview, so let’s see what the possibilities already are. Give it a name and an empty Data agent is created.

What we need to do first is to select some data sources for the Data agent. In my case, I will add the newly created Warehouse.

In the OneLake catalog screen, I selected the Warehouse and added it to my Data agent.

As of today, you can add up to five sources (Fabric lakehouses, warehouses, Power BI semantic models or KQL databases). (1)

The textbox / prompt input (2) lets us talk to our Data agent and ask it questions.
For fine-tuning, we can add instructions, prompts and clarifications for the AI (3). I will not focus on this here, but I may write another blog post about it in the future.
The first try…

Hmm.. this is strange. I added a data source, but the Data agent does not know about it. What could have gone wrong? Well, the agent needs some more hints: which parts of the data source should be made available to it?
By default, none of the tables is selected. So I selected all of them (except the Payment view) and tried again.

And that helped..


And now to the real questions.. 🙂
As the documentation states, Data agents are there for non-technical users who are not required to know the “real” tech details or even SQL to query the data source.

If you want to know how the Data agent arrived at its answer, you can expand the “x step completed” section.

Some more example questions..

The Data agent answered the question “What is the average tripduration in minutes per year and month?” as follows:
Some remarks here:
- The output shown to the user is sorted by month number but displays month names.
- The query output details contain the month number.

The generated query contains a division by 60. But why? Because the AI converted the value stored in the TripDurationSeconds column of the Trip table into the requested minutes.
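To make the division by 60 concrete, here is a minimal sketch of the kind of query described above, run against an in-memory SQLite database instead of a Fabric Warehouse. Only the Trip table, the Date table and the TripDurationSeconds column come from the sample; the DateID, Year and Month columns and the few rows are assumptions made up for illustration, and the SQL the Data agent actually generates will look different.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory stand-in for the sample warehouse.

    DateID, Year and Month are hypothetical column names; the rows are made up.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE "Date" (DateID INTEGER PRIMARY KEY, Year INTEGER, Month INTEGER);
        CREATE TABLE Trip (DateID INTEGER, TripDurationSeconds INTEGER);
        INSERT INTO "Date" VALUES (1, 2013, 1), (2, 2013, 2);
        INSERT INTO Trip VALUES (1, 600), (1, 1200), (2, 300);
        """
    )
    return conn


def average_trip_minutes(conn):
    """Return (year, month, average minutes) rows in chronological order."""
    query = """
        SELECT d.Year,
               d.Month,
               AVG(t.TripDurationSeconds) / 60.0 AS AvgTripMinutes  -- seconds to minutes
        FROM Trip AS t
        JOIN "Date" AS d ON d.DateID = t.DateID
        GROUP BY d.Year, d.Month
        ORDER BY d.Year, d.Month  -- sort by month number, not month name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for year, month, minutes in average_trip_minutes(build_demo_db()):
        print(year, month, minutes)
```

Sorting by the numeric month column is also what keeps the months in calendar order, even when the user-facing answer shows month names.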



Another interesting one: it’s important to work with the right naming and prompting.

Hmm.. My intention was to get the average duration for all days, but the Data agent used the information stored in the Date table to analyze weekdays only.

And now, with a slightly changed prompt, we get the average values for every day of the week.

It’s not only the data that counts; the AI helps you, too..
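The difference between the two readings of the prompt can be sketched in plain Python. This is only an illustration under assumptions: the sample trips are made up, and the function names are hypothetical. It shows one plausible way a "weekdays only" reading differs from a "per day of the week" reading, not what the Data agent does internally.

```python
from collections import defaultdict
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Made-up sample trips: (pickup date, duration in seconds).
TRIPS = [
    (date(2024, 1, 1), 600),   # Monday
    (date(2024, 1, 2), 1200),  # Tuesday
    (date(2024, 1, 6), 1800),  # Saturday
    (date(2024, 1, 8), 1200),  # Monday
]


def avg_minutes_weekdays_only(trips):
    """Reading 1: keep only Monday to Friday trips and average them."""
    durations = [secs for day, secs in trips if day.weekday() < 5]
    if not durations:
        return None
    return sum(durations) / len(durations) / 60


def avg_minutes_per_day_of_week(trips):
    """Reading 2: one average per day of the week, weekends included."""
    buckets = defaultdict(list)
    for day, secs in trips:
        buckets[DAY_NAMES[day.weekday()]].append(secs)
    return {name: sum(v) / len(v) / 60 for name, v in buckets.items()}


if __name__ == "__main__":
    print(avg_minutes_weekdays_only(TRIPS))
    print(avg_minutes_per_day_of_week(TRIPS))
```

In the first reading the Saturday trip silently drops out of the result, which is exactly the kind of surprise a more specific prompt avoids.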


My Data agent is ready – publish – go
My first version of the Data agent is ready. With the Publish button, I start the publish workflow to share the agent with my co-workers and to get a URL in case we want to use this Data agent programmatically.

Each publish action creates a new version of the Data agent. In addition, there is a draft version where we can keep working on and fine-tuning our Data agent.


But how does the path from question to query to answer work?
In our sample (as I use a Warehouse as the source), it’s NL2SQL that “does the magic”: it reads the question and instructions, incorporates the selected data source(s), generates the queries and returns the results to the user.
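The flow can be sketched as a toy pipeline. This is not how Fabric implements NL2SQL; it is a minimal sketch under assumptions, using SQLite instead of a Warehouse and a hard-coded `fake_generate_sql` function (hypothetical) in place of the language model. All function names and the demo rows are made up for illustration.

```python
import sqlite3


def build_demo_connection():
    """In-memory stand-in for the warehouse with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Trip (TripDurationSeconds INTEGER);
        INSERT INTO Trip VALUES (600), (1200), (300);
        """
    )
    return conn


def describe_schema(conn, selected_tables):
    """Describe only the tables that were selected for the agent."""
    lines = []
    for table in selected_tables:
        # Column 1 of each PRAGMA table_info row is the column name.
        cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
        lines.append(f"{table}({', '.join(cols)})")
    return "\n".join(lines)


def fake_generate_sql(question, schema):
    """Stand-in for the model: returns a fixed query if Trip is visible."""
    if "Trip(" not in schema:
        raise ValueError("No suitable table selected for this question")
    return "SELECT COUNT(*) FROM Trip"


def answer_question(conn, question, selected_tables, generate_sql):
    """Question -> schema context -> generated SQL -> query result."""
    schema = describe_schema(conn, selected_tables)
    sql = generate_sql(question, schema)
    rows = conn.execute(sql).fetchall()
    return sql, rows


if __name__ == "__main__":
    conn = build_demo_connection()
    print(answer_question(conn, "How many trips are there?", ["Trip"], fake_generate_sql))
```

Calling `answer_question` with an empty `selected_tables` list raises the `ValueError`, which mirrors the first try above: a data source was attached, but no tables were selected, so there was nothing to query.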

To summarize my first Data agent impressions
I had my first Data agent up and running within a few minutes. After creating the sample warehouse, I could (almost) immediately ask questions, and the Data agent answered them for me.
Some points not to forget (for future-Wolfi)
- Tenant settings need to be enabled
- Select not only the data sources but also the specific tables/objects
- Specific prompting is key. In another blog post, I will show you how to add AI instructions to give the Data agent more knowledge about your data.












































