Resourcing Overview

To host Onyx, we recommend:
| Resource | Minimum | Preferred |
|---|---|---|
| CPU | 4 vCPU | 8+ vCPU |
| RAM | 10 GB | 16+ GB |
| Disk | 50 GB + ~2.5x the indexed data | 500 GB for organizations with <5000 users |
Vespa does not allow writes once disk usage hits 75%. Make sure to always have some storage headroom.
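As a minimal sketch, you could monitor headroom on the host with a check like the one below. It assumes the 75% write-block threshold above; the function name is illustrative.

```python
import shutil

# Vespa blocks writes once disk usage reaches 75%.
VESPA_WRITE_BLOCK_THRESHOLD = 0.75

def has_headroom(path: str = "/", threshold: float = VESPA_WRITE_BLOCK_THRESHOLD) -> bool:
    """Return True if disk usage at `path` is still below the write-block threshold."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total < threshold

print(f"Headroom OK: {has_headroom('/')}")
```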

Local Deployment (Docker)

You can control the resources available to Docker in the Resources section of the Docker Desktop settings menu. Old, unused Docker images can take up sizeable disk space; to clean them up, run `docker system prune --all`.

Cloud Providers (AWS, GCP, etc.)

For small to mid scale deployments, we recommend deploying Onyx to a single instance in your cloud provider of choice.
For most use cases a single reasonably sized instance is enough for excellent performance!
When evaluating your instance, follow the Preferred resources in the table above.
| Provider | Recommended Instance Type |
|---|---|
| AWS | m7g.xlarge |
| GCP | e2-standard-4 or e2-standard-8 |
| Azure | D4s_v3 |
| DigitalOcean | Meet the preferred resources in the table above |

Container-Specific Resourcing

For more efficient scaling, you can dedicate resources to each Onyx container using Kubernetes or AWS EKS. See the Onyx Helm chart values.yaml for our default requests and limits.
| Component | CPU | Memory |
|---|---|---|
| api_server | 1 | 2 Gi |
| background | 2 | 8 Gi |
| indexing_model_server | 2 | 4 Gi |
| inference_model_server | 2 | 4 Gi |
| postgres | 2 | 2 Gi |
| vespa | >= 4 | >= 8 Gi |
| nginx | 250m (1/4) | 128 Mi |
The vespa recommendation is the bare minimum for a production deployment. With 50 GB of documents, we recommend at least 10 CPU and 20 Gi of memory. All together, this comes out to a total available node size of at least ~14 CPU and ~30 GB of memory.
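As a sketch of how the table above might translate into Helm values, the fragment below uses the standard Kubernetes `resources` schema; the exact key structure depends on the Onyx chart's values.yaml, so treat the layout as illustrative.

```yaml
# Illustrative per-container requests mirroring the table above.
api_server:
  resources:
    requests:
      cpu: "1"
      memory: 2Gi
vespa:
  resources:
    requests:
      cpu: "4"
      memory: 8Gi
nginx:
  resources:
    requests:
      cpu: 250m
      memory: 128Mi
```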

How Resource Requirements Scale

The main driver of resource requirements is the number of indexed documents. This primarily affects the index component of Onyx (a Vespa vector database), which is responsible for storing the vectorized documents and handling search requests.
Vespa’s resource requirements scale linearly with the document count.
Based on our experience with large scale deployments, in addition to the previously mentioned minimums, Vespa needs:
  • ~3GB of memory for each additional 1GB of documents
  • ~1 CPU for each additional 2GB of documents
These are our rough estimates. Other factors that may affect resource requirements include:
  • The embedding model
  • Whether you have quantization and dimensional reduction enabled

Resourcing Example

For a deployment with 10GB of text content, your index component will need:
  • CPU: 4 + 10 * 0.5 = 9 cores
  • Memory: 4 + 10 * 3 = 34GB
If deploying on a single instance, this would be in addition to the base requirements. Overall, that takes us to >= 13 CPU and >= 50 GB of memory. Given these requirements, an m7g.4xlarge or a c5.9xlarge EC2 instance would be appropriate. If deploying with Kubernetes or AWS EKS, this would give a per-component resource allocation of:
| Component | CPU | Memory |
|---|---|---|
| api_server | 1 | 2 Gi |
| background | 2 | 8 Gi |
| indexing_model_server | 2 | 4 Gi |
| inference_model_server | 2 | 4 Gi |
| postgres | 2 | 4 Gi |
| vespa | 10 | 34 Gi |
Total available node size: ~20 CPU and ~60GB of Memory.
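The arithmetic in the example above can be sketched as a small helper. The base figures (4 CPU, 4 GB) and per-GB rates are taken from the worked example and the scaling rules earlier in this page; the function name is illustrative.

```python
def vespa_resource_estimate(doc_gb: float) -> tuple[float, float]:
    """Estimate Vespa's CPU cores and memory (GB) for a given volume of
    indexed documents, using the rough linear scaling rules above."""
    base_cpu, base_mem_gb = 4, 4           # baseline used in the worked example
    cpu = base_cpu + doc_gb * 0.5          # ~1 CPU per additional 2 GB of documents
    mem_gb = base_mem_gb + doc_gb * 3      # ~3 GB of memory per additional 1 GB of documents
    return cpu, mem_gb

cpu, mem = vespa_resource_estimate(10)
print(f"10 GB of documents -> {cpu:g} CPU, {mem:g} GB memory")  # 9 CPU, 34 GB
```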

Next Steps