Use these instructions to configure NLWeb to run in Azure. These instructions assume that you have an Azure subscription, the Azure CLI installed locally, and Python 3.10+ installed locally.
- Log in to Azure:

  ```bash
  az login
  ```
- Create a resource group (if you already have a resource group you want to use, skip to the next step):

  ```bash
  az group create --name yourResourceGroup --location eastus2
  ```
- Create an App Service Plan:

  ```bash
  az appservice plan create --name yourAppServicePlan --resource-group yourResourceGroup --sku P1v3 --is-linux
  ```
- Create a Web App:

  ```bash
  az webapp create --resource-group yourResourceGroup --plan yourAppServicePlan --name yourWebAppName --runtime "PYTHON:3.13"
  ```
- Configure environment variables; modify the command below to include all of the environment variables in your `.env`:

  ```bash
  az webapp config appsettings set --resource-group yourResourceGroup --name yourWebAppName --settings \
    AZURE_VECTOR_SEARCH_ENDPOINT="https://TODO.search.windows.net" \
    AZURE_VECTOR_SEARCH_API_KEY="TODO" \
    AZURE_OPENAI_ENDPOINT="https://TODO.openai.azure.com/" \
    AZURE_OPENAI_API_KEY="TODO" \
    WEBSITE_RUN_FROM_PACKAGE=1 \
    SCM_DO_BUILD_DURING_DEPLOYMENT=true \
    NLWEB_OUTPUT_DIR=/home/data
  ```
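If your `.env` already lists these values, you can build the `--settings` arguments from it instead of retyping them. A sketch (`env_to_settings` is a hypothetical helper; it assumes simple `KEY=VALUE` lines with no spaces around `=` and no multi-line values):

```shell
#!/bin/sh
# Sketch: turn a .env file's KEY=VALUE lines into a single space-separated
# string suitable for `az webapp config appsettings set --settings ...`.
# Assumes simple KEY=VALUE lines; comments and blank lines are skipped.
env_to_settings() {
  grep -E '^[A-Za-z_][A-Za-z0-9_]*=' "$1" | tr '\n' ' '
}

# Usage (hypothetical):
# az webapp config appsettings set --resource-group yourResourceGroup \
#   --name yourWebAppName --settings $(env_to_settings .env)
```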
- Set the startup command:

  ```bash
  az webapp config set --resource-group yourResourceGroup --name yourWebAppName --startup-file "startup.sh"
  ```
- Create a ZIP archive of the code. Do this from within your cloned NLWeb folder, making sure you have set the preferred providers you will use in the `config` folder before doing this. If you are not using the `main` branch, replace `main` with the branch name to use. NOTE THAT THIS WILL NOT PICK UP LOCAL CHANGES: any preferred providers must be checked into the specified branch.

  ```bash
  git archive --format zip --output ./app.zip main
  ```
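Because `git archive` only packs files committed on the named branch, it can be worth checking that your config changes actually made it into the archive before deploying. A sketch (`zip_contains` is a hypothetical helper; it uses Python's stdlib `zipfile` module so no extra tools are required):

```shell
#!/bin/sh
# Sketch: check whether a given path appears inside a ZIP archive.
zip_contains() {
  python3 -c 'import sys, zipfile
names = zipfile.ZipFile(sys.argv[1]).namelist()
sys.exit(0 if any(sys.argv[2] in n for n in names) else 1)' "$1" "$2"
}

# Usage (after running the git archive command above):
# zip_contains ./app.zip "config/" && echo "config folder is in the archive"
```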
- Deploy the code using ZIP deployment:

  ```bash
  az webapp deploy --resource-group yourResourceGroup --name yourWebAppName --type zip --src-path ./app.zip
  ```
View logs in the Azure Portal or using the Azure CLI:

```bash
az webapp log tail --name yourWebAppName --resource-group yourResourceGroup
```

Azure App Service provides diagnostic tools in the Azure Portal:
- Go to your Web App
- Navigate to "Diagnose and solve problems"
- Choose from available diagnostics tools
The application includes a health endpoint at /health that returns a JSON response indicating service health.
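One way to check the endpoint from a shell after deploying: the exact JSON payload is not specified here, so `is_healthy` below is a hypothetical helper that assumes the response contains a `"status": "healthy"` field; adjust the pattern to whatever your deployment actually returns.

```shell
#!/bin/sh
# Sketch: check a /health JSON response for a healthy status.
# ASSUMPTION: the response contains a `"status": "healthy"` field.
is_healthy() {
  printf '%s' "$1" | grep -q '"status"[[:space:]]*:[[:space:]]*"healthy"'
}

# Usage (hypothetical hostname):
# resp=$(curl -sf "https://yourWebAppName.azurewebsites.net/health")
# is_healthy "$resp" && echo "service is healthy"
```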
If you don't have an LLM endpoint already, you can follow these instructions to deploy a new endpoint with Azure OpenAI:
- Create an Azure OpenAI resource via the portal. Use these instructions as a guide as needed.
  Notes:
  - Make sure you select a region where the models you want to use are available. Refer to the AOAI Model Summary Table and Region Availability for more info. To use the Azure OAI defaults of 4.1 and 4.1-mini in `config_llm.yaml`, we recommend using `eastus2` or `swedencentral`.
  - If you are calling this endpoint locally, make the endpoint accessible from the internet in the network setup step.
- Once your AOAI resource is created, you'll need to deploy your models within that resource. This is done from Azure AI Foundry under Deployments. You can see instructions for this at Azure AI Foundry - Deploy a Model.
Notes:
- Make sure the resource you created in the prior step is showing in the dropdown at the top left of the screen.
  - You will need to repeat this step 3 times to deploy three base models: `gpt-4.1`, `gpt-4.1-mini`, and `text-embedding-3-small`.
- You'll need to add your Azure OpenAI endpoint and key to your `.env` file (see step 5 of Local setup in the README file). You can find the endpoint and API key for the Azure OpenAI resource that you created above in the Azure portal, not in Azure AI Foundry where you were deploying the models. Click on the Azure OpenAI resource, and then in the left-hand sidebar under "Resource Management," select "Keys and Endpoint."
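For reference, the corresponding `.env` entries would look like this (placeholder values, matching the format used earlier; paste your own endpoint and key from "Keys and Endpoint"):

```shell
# .env (placeholders - replace with your resource's values)
AZURE_OPENAI_ENDPOINT="https://TODO.openai.azure.com/"
AZURE_OPENAI_API_KEY="TODO"
```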
If you are seeing a lot of timeout issues, you may need to increase your AOAI rate limits.
- Go to the Azure AI Foundry Deployments tab, where your deployed models are shown. You will need to repeat these instructions for each of the 3 models/embeddings.
- Click the radio button to the left of the model. (Alternatively, you can click into the model to see a full page of the configuration.)
- Click 'Edit' at the top of the list of models. ('Edit' is at the top left of the next page if you clicked into the model.)
- In the popup, scroll to the bottom where it says 'Tokens per Minute Rate Limit' and drag the slider to a higher value.

