# SQL Question Answering Tool

## Dev Environment Setup

Create an env in your preferred way. The example here uses pyenv and virtualenv.

```bash
pyenv virtualenv 3.9.21 sql_tool_testing
pyenv local sql_tool_testing
pip install --upgrade pip

pip install -r code-env/python/spec/requirements.github.txt
pip install -r tests/python/unit/requirements.txt

```

## Basic Testing

Ensure that the sql_tool_testing environment is activated.
Run mypy tests using `make mypy`  
Run linting tests with ruff using `make lint`  
Run unit tests with `make unit`  
Run all the previously mentioned tests together with `make tests`

## Basic Manual Testing Setup

Install the plugin from GitHub on the Dataiku instance where you plan to run manual tests. If you don’t have any data to test on, you can generate some with a Python recipe:

```python
# -*- coding: utf-8 -*-
import dataiku
import numpy as np
import pandas as pd

# Fix the seed so the generated data is reproducible
np.random.seed(42)

n_rows = 20

data = {
    'Category': np.random.choice(['A', 'B', 'C', 'D'], size=n_rows),  # Categorical data
    'Value': np.random.randint(1, 100, size=n_rows),  # Numerical data
    'Score': np.random.uniform(0, 1, size=n_rows),  # Numerical data (float)
    'Group': np.random.choice(['X', 'Y'], size=n_rows)  # Categorical data
}

df = pd.DataFrame(data)

# Write recipe outputs
some_data = dataiku.Dataset("some_data")
some_data.write_with_schema(df)
```

Otherwise feel free to use your own SQL data to test on.
In your testing project, navigate to the agent tools and create a “SQL query” tool. If you don’t see it in the list at first, try a hard refresh of the page and, if that does not help, restart the Dataiku instance.
After this, select a SQL connection matching the SQL dataset(s) you want to query and select an LLM connection to use.
Select the dataset or datasets that you would like to include and give a detailed description of the data under “Data context information”. For now, dataset descriptions and column descriptions are not included in the prompt, so you have to add everything in “Data context information”: column descriptions, data types, and how to interpret the data.
If you chose to use the code snippet above to make the data, then you can use the following text:

> Answer the questions using the following dataset.
> This dataset contains the results of a test taken by a group of people. Each person completed a test consisting of various questions and got a raw value between 1 and 100 plus a score. Each user is part of a group.
> The table has the following columns:
>
> - Category: the user's category inside their group; it can be A, B, C or D
> - Value: the raw value that the user got at the end of the test (between 1 and 100)
> - Score: the normalized score, ranging from 0 to 1
> - Group: the original group to which the user belongs

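Since column descriptions and data types have to be written by hand into “Data context information”, it can help to generate that text from a small column specification. The sketch below is only an illustration: `build_data_context` and the `ColumnSpec` triples are hypothetical names, and the data types are inferred from the sample recipe above rather than read from Dataiku.

```python
from typing import List, Tuple

# Hypothetical structure: (column name, data type, description)
ColumnSpec = Tuple[str, str, str]


def build_data_context(summary: str, columns: List[ColumnSpec]) -> str:
    """Render a plain-text block that lists every column with its type."""
    lines = [summary.strip(), "The table has the following columns:"]
    for name, dtype, description in columns:
        lines.append(f"- {name} ({dtype}): {description}")
    return "\n".join(lines)


context = build_data_context(
    "Results of a test taken by a group of people.",
    [
        ("Category", "string", "User category inside its group: A, B, C or D"),
        ("Value", "integer", "Raw value obtained at the end of the test"),
        ("Score", "float", "Normalized score, ranging from 0 to 1"),
        ("Group", "string", "Original group of the user: X or Y"),
    ],
)
print(context)  # paste the printed text into "Data context information"
```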
Next, navigate to quick test. You can add your question in the format below:

```json
{
  "input": {
    "question": "how many D in the category column?"
  },
  "context": {}
}
```

Once executed you can check the traces and logs to see what has been executed.

To test within an agent-based configuration, navigate to `Models & Agents` on Dataiku versions earlier than 14, or `Agents and GenAI Models` on version 14 and later. Create a new agent and select the agent tool instance that was previously created. Add an `Additional prompt` and an `Additional description` if you wish, then navigate to quick test.

```json
{
  "messages": [
    {
      "role": "user",
      "content": "how many D in the category column?"
    }
  ],
  "context": {}
}
```

Once executed you can check the traces and logs to see what has been executed.
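
The two quick-test payloads differ only in how the question is wrapped. As a minimal sketch, they could be generated with the standard `json` module; `tool_payload` and `agent_payload` are hypothetical helper names used for illustration, not part of the plugin.

```python
import json


def tool_payload(question: str) -> dict:
    """Payload shape for quick-testing the tool directly."""
    return {"input": {"question": question}, "context": {}}


def agent_payload(question: str) -> dict:
    """Payload shape for quick-testing an agent that uses the tool."""
    return {"messages": [{"role": "user", "content": question}], "context": {}}


question = "how many D in the category column?"
print(json.dumps(tool_payload(question), indent=2))
print(json.dumps(agent_payload(question), indent=2))
```

The printed JSON can be pasted into the corresponding quick test editor.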

Finally, you can use this agent in Agent Connect. When setting this up, remember to provide a description for the agent. You can then ask a question, ensure that the agent is used, and check the logs and traces for a correct response. [More information about setting up an agent in Agent Connect can be found here](https://doc.dataiku.com/dss/latest/generative-ai/chat-ui/agent-connect.html#answers-and-agents-configuration)

## Manually Running the Workflows From Local Machine

The integration tests run on a [test project](https://tests-integration.solutions.dataiku-dss.io/projects/SQLTOOLPLUGININTEGRATIONTESTS/flow/) named `SQLTOOLPLUGININTEGRATIONTESTS` on the tests-integration instance.
They can be run locally using `act`. To do so, make sure you have Docker installed and create a file called `.secrets` at the root of the repository. It should include the following values:

```
tests_integrations_url=***
tests_integrations_api_key=***
tests_integrations_reader_user_name=***
tests_integrations_reader_user_pwd=***
```
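
A missing or empty secret tends to surface only once the workflow is already running, so a quick local check can save a cycle. The following is a minimal sketch that assumes the file contains simple `KEY=value` lines; `parse_secrets` and `missing_keys` are hypothetical helpers, not part of this repository or of `act`.

```python
from pathlib import Path
from typing import Dict, List

REQUIRED_KEYS = [
    "tests_integrations_url",
    "tests_integrations_api_key",
    "tests_integrations_reader_user_name",
    "tests_integrations_reader_user_pwd",
]


def parse_secrets(text: str) -> Dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and '#' comments."""
    secrets: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        secrets[key.strip()] = value.strip()
    return secrets


def missing_keys(text: str) -> List[str]:
    """Return the required keys that are absent or have an empty value."""
    secrets = parse_secrets(text)
    return [key for key in REQUIRED_KEYS if not secrets.get(key)]


path = Path(".secrets")
if path.exists():
    missing = missing_keys(path.read_text())
    print("Missing keys:", missing if missing else "none")
else:
    print(".secrets not found in the current directory")
```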

If you're running on Apple Silicon, you'll also need to create an `.actrc` file at the root of the repository. This tells act to use the linux/amd64 container architecture locally. Add these values:

```
--container-architecture linux/amd64
-P ubuntu-latest=ghcr.io/catthehacker/ubuntu:act-20.04
```
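
If you are unsure whether your machine counts as Apple Silicon, Python's standard `platform` module can tell you. This is a rough sketch: `is_apple_silicon` is a hypothetical helper, and a Python interpreter running under Rosetta reports `x86_64`, so treat the result as a hint only.

```python
import platform


def is_apple_silicon(system: str, machine: str) -> bool:
    """True when the platform module reports macOS on an ARM64 CPU."""
    return system == "Darwin" and machine == "arm64"


# Same values as the .actrc example above
ACTRC_LINES = [
    "--container-architecture linux/amd64",
    "-P ubuntu-latest=ghcr.io/catthehacker/ubuntu:act-20.04",
]

if is_apple_silicon(platform.system(), platform.machine()):
    print("Apple Silicon detected; .actrc should contain:")
    print("\n".join(ACTRC_LINES))
else:
    print("Apple Silicon not detected; the .actrc file is probably not needed")
```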

Install act using `brew` or `ibrew` depending on your setup.

```bash
brew install act
```

You can now run all the workflows like this:

```
act --secret-file ./.secrets
```

or a specific workflow file in the following way:

```
act --workflows .github/workflows/update_and_integration.yml --secret-file ./.secrets
```

If you find that the integration test is failing with act, you may need to change one line in the workflow `update_and_integration.yml`:

Replace `playwright install chromium` with `playwright install chromium --with-deps`.

## Evaluation script

Use this script to quickly assess the current development version of the tool. For more details and usage instructions, refer to [this wiki](https://design.solutions.dataiku-dss.io/projects/SQL_TOOL_EVALUATION/wiki/3/In-Development%20Benchmark%20Script).

### Installation

As this script runs outside Dataiku, ensure you have a local configuration set up by following these [instructions](https://design.solutions.dataiku-dss.io/projects/WIKI/wiki/64/Local%20dev%20setup%20%F0%9F%92%BB#connecting-to-dataiku-instances-1).

Then, prepare your script's virtual environment:

```bash
cd eval-tool
uv venv .venv
source .venv/bin/activate
uv pip install -r requirements.txt
```

> **_NOTE:_** Ensure you do not mix the environments for your SQL tool and the evaluation script.

You will also need to make the `dataiku` libs available in your env, for example by symlinking them:

```bash
ln -s ~/dataiku-dss-14.0.1/python/dataikuapi .venv/lib/python3.9/site-packages/dataikuapi
ln -s ~/dataiku-dss-14.0.1/python/dataiku .venv/lib/python3.9/site-packages/dataiku
```
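
To confirm that the links resolve from inside the activated `.venv`, you can ask Python whether it can locate both packages. This sketch only checks that the modules are discoverable, not that all their dependencies import cleanly; `missing_modules` is a hypothetical helper.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on sys.path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(["dataiku", "dataikuapi"])
print("Missing modules:", missing if missing else "none")
```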

### Run the script

#### Environment variables

Before running the script, please ensure that you have these variables set:

| Variable name                    | Description                                          | Example                 |
| -------------------------------- | ---------------------------------------------------- | ----------------------- |
| SQL_TOOL_EVAL_TARGET_PROJECT_KEY | ID of the Dataiku project to access tools & datasets | `SQL_TOOL_EVALUATION`   |
| SQL_TOOL_EVAL_TEST_ID            | ID that will be displayed in the evaluation charts   | `GLOBAL_a4996d1_o3mini` |

> **_NOTE:_** `SQL_TOOL_EVAL_TEST_ID` should be concise and meaningful, for example: `<SQLTOOLNAME>_<COMMIT/BRANCH/FEATURE>_<LLMVERSION/CONFIG>`

Then, you can run the script:

```
python main.py
```
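
Because the script relies on `SQL_TOOL_EVAL_TARGET_PROJECT_KEY` and `SQL_TOOL_EVAL_TEST_ID`, a short pre-flight check can catch an unset value before a run. The sketch below is an illustration only: `unset_variables` and `make_test_id` are hypothetical helpers and are not part of `main.py`.

```python
import os
from typing import List, Mapping

REQUIRED_VARS = ["SQL_TOOL_EVAL_TARGET_PROJECT_KEY", "SQL_TOOL_EVAL_TEST_ID"]


def unset_variables(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are missing or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def make_test_id(tool_name: str, revision: str, llm: str) -> str:
    """Join the parts as <SQLTOOLNAME>_<COMMIT/BRANCH/FEATURE>_<LLMVERSION/CONFIG>."""
    return "_".join([tool_name, revision, llm])


print("Unset variables:", unset_variables(os.environ) or "none")
print(make_test_id("GLOBAL", "a4996d1", "o3mini"))
```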

You will be prompted to select the SQL tools and query subsets to test. Ensure the configuration matches the datasets in the selected Dataiku instance & project.

Once all tests are complete, a `.json` file with the results will be generated. You can review the file for specific details or follow [these steps](https://design.solutions.dataiku-dss.io/projects/SQL_TOOL_EVALUATION/wiki/3/In-Development%20Benchmark%20Script#dataiku-flow-chart-1) for a more visual representation of the insights.

### Golden queries

The golden queries and their results are located in the `eval-tool/resources/golden_results.json` file. This static file is generated by the `eval-tool/golden_queries.py` script. You should not need to update this file unless the queries in `eval-tool/resources/mini_dev_postgresql.py` are modified.

## Troubleshooting

### Mixed context

Because of the way [tools work inside Dataiku](https://design.solutions.dataiku-dss.io/projects/WIKI/wiki/101/Writing%20Agent%20Tools#important-considerations-warning--1), please make sure that the class attributes of the tool can be shared across executions without causing any data interference.
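
The risk can be shown with plain Python, without any Dataiku API. In this sketch both classes are hypothetical stand-ins for a tool: the first keeps mutable state in a class attribute, so one call sees data from another; the second shares only immutable configuration and builds its state per call.

```python
from typing import Dict, List


class LeakyTool:
    """Hypothetical tool whose mutable class attribute is shared by all calls."""

    history: List[str] = []  # a single list object for the whole class

    def invoke(self, question: str) -> List[str]:
        self.history.append(question)  # mutates the shared list
        return list(self.history)


class IsolatedTool:
    """Hypothetical tool that shares only immutable config across calls."""

    MAX_ROWS = 100  # immutable, safe to share

    def invoke(self, question: str) -> Dict[str, object]:
        # State is created inside the call, so nothing carries over
        return {"question": question, "max_rows": self.MAX_ROWS}


first = LeakyTool().invoke("question from user A")
second = LeakyTool().invoke("question from user B")
print(second)  # holds both questions: user A's input leaked into user B's call

print(IsolatedTool().invoke("question from user B"))
```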

> **Note**: you can check [this Slack thread](https://dataiku.slack.com/archives/C0601D916CC/p1748965793144039) and [this one](https://dataiku.slack.com/archives/CGD1MMU3Z/p1756477664318009) for more context.
