This section lists the requirements for using this solution.

# Instance Requirements
This solution is only compatible with instances running **Dataiku V14.2**.

## Code Environment
The project's Python recipes use the code env **solution_clinical-site-intel**, with **Python 3.9**.

The required packages for this code env are:

> Flask==3.0.2
> scikit-learn==1.4.0
> faiss-cpu==1.7.0
> python-dotenv==0.19.0
> dataiku-api-client>=11.0.0
> tqdm==4.66.1
> transformers==4.24.0
> torch==2.0.1+cu117
> --find-links https://download.pytorch.org/whl/torch_stable.html
> sentencepiece==0.2.0
> git+https://github.com/dataiku/solutions-contrib.git#egg=webaiku&subdirectory=bs-infra
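Once the code env is built, the exact pins listed above can be sanity-checked from a notebook or recipe that uses it. The following is a minimal sketch, not part of the solution itself: the `PINNED` dictionary and the `check_pins` helper are hypothetical names introduced for illustration, and only the exact `==` pins are covered (the `>=` constraint and the Git dependency are skipped).

```python
from importlib.metadata import PackageNotFoundError, version

# Exact pins copied from the package list above (illustrative subset:
# the ">=" constraint and the Git dependency are intentionally left out).
PINNED = {
    "Flask": "3.0.2",
    "scikit-learn": "1.4.0",
    "faiss-cpu": "1.7.0",
    "python-dotenv": "0.19.0",
    "tqdm": "4.66.1",
    "transformers": "4.24.0",
    "torch": "2.0.1+cu117",
    "sentencepiece": "0.2.0",
}


def check_pins(pinned, get_version=version):
    """Return {package: (expected, installed or None)} for every mismatch."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed}")
```

An empty result means every pinned package is installed at the expected version. The `get_version` parameter exists only so the helper can be exercised without the real packages installed.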


The code environment also requires an initialization script. Paste the following script in the **Resources** tab of the code environment.

```

## Base imports
import os

from dataiku.code_env_resources import clear_all_env_vars
from dataiku.code_env_resources import grant_permissions
from dataiku.code_env_resources import set_env_path
from dataiku.code_env_resources import set_env_var

# Clear all environment variables defined by a previously run script
clear_all_env_vars()

## Hugging Face
# Set the Hugging Face cache directory
set_env_path("HF_HOME", "huggingface")
hf_home_dir = os.getenv("HF_HOME")

# Import transformers only after HF_HOME is set so the cache location is picked up
from transformers import AutoModel, AutoTokenizer

# Download the tokenizer and weights of both models into the cache
for model_name in (
    "DataikuNLP/paraphrase-multilingual-MiniLM-L12-v2",
    "emilyalsentzer/Bio_ClinicalBERT",
):
    AutoTokenizer.from_pretrained(model_name)
    AutoModel.from_pretrained(model_name)

# Grant permissions on the cache directory so the downloaded models are usable at runtime
grant_permissions(hf_home_dir)

```
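After the resources script has run, it can be useful to confirm that both models are served from the local cache rather than downloaded again. The sketch below is illustrative only: `missing_from_cache` is a hypothetical helper, and it assumes that `from_pretrained(..., local_files_only=True)` raises `OSError` (or, in some cases, `ValueError`) when the files are not cached, which may vary across `transformers` versions.

```python
# Model names taken from the resources script above.
MODEL_NAMES = [
    "DataikuNLP/paraphrase-multilingual-MiniLM-L12-v2",
    "emilyalsentzer/Bio_ClinicalBERT",
]


def missing_from_cache(model_names, load=None):
    """Return the model names that cannot be loaded without network access."""
    if load is None:
        # Imported lazily so the helper can be tested without transformers.
        from transformers import AutoModel, AutoTokenizer

        def load(name):
            # local_files_only=True forbids any download attempt.
            AutoTokenizer.from_pretrained(name, local_files_only=True)
            AutoModel.from_pretrained(name, local_files_only=True)

    missing = []
    for name in model_names:
        try:
            load(name)
        except (OSError, ValueError):
            # Assumption: these are the errors raised for uncached files.
            missing.append(name)
    return missing
```

Calling `missing_from_cache(MODEL_NAMES)` from a recipe that uses the code env should return an empty list if the script populated the cache as intended.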

## Containerized Execution
A containerized execution configuration is required for this solution.
![Screenshot 2024-04-25 at 09.58.25.png](wkkZ7puRVCk3)

When running on Dataiku Cloud, also include **GPU support for Torch 2** in the runtime.
![Screenshot 2024-04-16 at 13.54.01.png](O3EXl118DVh7)


## Plugins
The [Reverse Geocoding / Admin maps](https://www.dataiku.com/product/plugins/geoadmin/) plugin is required.