This guide explains how to create, modify, and delete model serving endpoints on Databricks.
Creating Model Serving Endpoints
You have two options to create model serving endpoints: using the Databricks Machine Learning API or the Databricks Machine Learning UI.
- API Workflow
To create an endpoint using the API, you can use the following Python code with the requests library:
import requests
Notice that:
- The POST request method is used for creating endpoints.
- The create request only works the first time, when the endpoint does not exist yet. If the underlying model version changes or the configuration needs updating, use the modify endpoint method instead.
- The access-token is your Databricks access token, which you can generate by following this blog.
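As a minimal sketch of the create call described above, the snippet below assembles a POST request for the serving endpoints REST API. The path `/api/2.0/serving-endpoints`, the `served_models` payload shape, and the helper names `build_create_request` and `send` are assumptions for illustration; the workspace URL, token, endpoint name, and model name are placeholders. Check the field names against the API reference for your workspace before relying on them.

```python
def build_create_request(host, token, endpoint_name, model_name, model_version):
    """Assemble the keyword arguments for a POST that creates a serving endpoint.

    Assumption: the endpoint is created at /api/2.0/serving-endpoints and the
    body uses the "served_models" layout shown here.
    """
    return {
        "method": "POST",
        "url": f"{host.rstrip('/')}/api/2.0/serving-endpoints",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {
            "name": endpoint_name,
            "config": {
                "served_models": [
                    {
                        "model_name": model_name,
                        "model_version": model_version,
                        "workload_size": "Small",       # compute size
                        "scale_to_zero_enabled": True,  # scale down when idle
                    }
                ]
            },
        },
    }


def send(request_kwargs):
    """Send a prepared request and raise on an HTTP error status."""
    import requests  # imported here so building a request needs no third-party package

    response = requests.request(timeout=30, **request_kwargs)
    response.raise_for_status()
    return response


# Usage (placeholders, not real values):
# req = build_create_request(
#     "https://<your-workspace-url>", "<access-token>", "my-endpoint", "my-model", "1"
# )
# send(req)
```

Keeping the request construction separate from the network call makes it easy to inspect the payload before anything is sent.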
- UI Workflow
To create an endpoint using the UI, follow these steps:
- Go to the Databricks sidebar and click on “Serving”.
- Click on “Create serving endpoint”.
- Provide a name for your endpoint.
- In the “Edit configuration” section, select the model from either the Workspace Model Registry or Unity Catalog, along with its version.
- Click “Confirm”.
- Select the compute size for your endpoint and specify if it should scale to zero when not in use.
- Configure the traffic percentage to route to the served model.
- Click “Create serving endpoint”.
Modifying the Compute Configuration of an Endpoint
After enabling an endpoint, you can modify its compute configuration using either the API or the UI.
- API Workflow
To modify the compute configuration of an endpoint using the API, you can use the following Python code:
import requests
Notice that:
- The PUT request method is used for modifying the compute configuration of an endpoint.
- Use this method when you want to update the compute configuration or change the served models of an existing endpoint, for example to serve a newer model version.
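The update described above could look roughly like the following sketch, which builds a PUT request against an existing endpoint. The path `/api/2.0/serving-endpoints/{name}/config`, the `served_models` and `traffic_config` field names, and the helper names are assumptions for illustration; all values are placeholders.

```python
from urllib.parse import quote


def build_update_config_request(host, token, endpoint_name, model_name,
                                model_version, workload_size="Small",
                                scale_to_zero=True):
    """Assemble the keyword arguments for a PUT that replaces an endpoint's config.

    Assumption: the config lives at /api/2.0/serving-endpoints/<name>/config and
    traffic is routed by the served model's "name" field.
    """
    served_name = f"{model_name}-{model_version}"  # hypothetical naming choice
    return {
        "method": "PUT",
        "url": f"{host.rstrip('/')}/api/2.0/serving-endpoints/{quote(endpoint_name)}/config",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {
            "served_models": [
                {
                    "name": served_name,
                    "model_name": model_name,
                    "model_version": model_version,
                    "workload_size": workload_size,
                    "scale_to_zero_enabled": scale_to_zero,
                }
            ],
            # Route all traffic to the single served model.
            "traffic_config": {
                "routes": [
                    {"served_model_name": served_name, "traffic_percentage": 100}
                ]
            },
        },
    }


def send(request_kwargs):
    """Send a prepared request and raise on an HTTP error status."""
    import requests  # imported here so building a request needs no third-party package

    response = requests.request(timeout=30, **request_kwargs)
    response.raise_for_status()
    return response


# Usage (placeholders): move the endpoint to model version 2 on a larger workload.
# send(build_update_config_request(
#     "https://<your-workspace-url>", "<access-token>", "my-endpoint",
#     "my-model", "2", workload_size="Medium"))
```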
- UI Workflow
To modify the compute configuration of an endpoint using the UI, follow these steps:
- Go to the Databricks sidebar and click on “Serving”.
- Select the endpoint you want to modify.
- Click on “Edit configuration”.
- Choose a workload size and specify if the endpoint should scale down to zero when not in use.
- Modify the traffic percentage to route to the served model.
- Click “Save”.
Deleting a Model Serving Endpoint
To disable serving for a model, you can delete the endpoint it’s served on.
- API Workflow
To delete an endpoint using the API, you can use the following Python code:
import requests
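A delete call along these lines might be sketched as follows. The path `/api/2.0/serving-endpoints/{name}` and the helper names are assumptions for illustration; the workspace URL, token, and endpoint name are placeholders.

```python
from urllib.parse import quote


def build_delete_request(host, token, endpoint_name):
    """Assemble the keyword arguments for a DELETE that removes a serving endpoint.

    Assumption: the endpoint is addressed as /api/2.0/serving-endpoints/<name>.
    """
    return {
        "method": "DELETE",
        "url": f"{host.rstrip('/')}/api/2.0/serving-endpoints/{quote(endpoint_name)}",
        "headers": {"Authorization": f"Bearer {token}"},
    }


def send(request_kwargs):
    """Send a prepared request and raise on an HTTP error status."""
    import requests  # imported here so building a request needs no third-party package

    response = requests.request(timeout=30, **request_kwargs)
    response.raise_for_status()
    return response


# Usage (placeholders, not real values):
# send(build_delete_request("https://<your-workspace-url>", "<access-token>", "my-endpoint"))
```

Deleting the endpoint stops serving for every model routed through it, so it is worth double-checking the endpoint name before sending the request.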
- UI Workflow
To delete an endpoint using the UI, follow these steps:
- Go to the Databricks sidebar and click on “Serving”.
- Select the endpoint you want to delete.
- Click on the kebab menu at the top and choose “Delete”.
These instructions cover how to create, modify, and delete model serving endpoints on Databricks using both the API and the UI.