There are multiple possible reasons for search performance degradation. The most common ones are:
hnsw_ef, complex filters without payload index)

Use when: individual queries take too long regardless of load.
with_payload: false and with_vectors: false to see if payload retrieval is the bottleneck

Use when: system can't serve enough queries per second under load.
default_segment_number to 2) Maximizing throughput
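
The `with_payload: false` / `with_vectors: false` check mentioned above can be sketched as a small timing comparison. This is only an illustration: `search_fn` is a hypothetical callable that wraps whatever client call you use and accepts `with_payload` and `with_vectors` keyword arguments; the helper names and the number of runs are assumptions, not part of any Qdrant API.

```python
import statistics
import time
from typing import Any, Callable, Dict


def median_latency(search_fn: Callable[..., Any], runs: int, **kwargs: Any) -> float:
    """Median wall-clock seconds of search_fn(**kwargs) over `runs` calls."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        search_fn(**kwargs)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_payload_cost(search_fn: Callable[..., Any], runs: int = 20) -> Dict[str, float]:
    """Time the same search with and without payload/vector retrieval.

    `search_fn` is a hypothetical wrapper around your own search call.
    """
    full = median_latency(search_fn, runs, with_payload=True, with_vectors=True)
    bare = median_latency(search_fn, runs, with_payload=False, with_vectors=False)
    ratio = full / bare if bare > 0 else float("inf")
    return {"full_s": full, "bare_s": bare, "ratio": ratio}
```

A ratio well above 1 suggests that fetching payloads or vectors, rather than the index search itself, accounts for much of the latency.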
Use when: filtered search is significantly slower than unfiltered. Most common SA complaint after memory.
is_tenant=true for primary filtering condition: Tenant index
nested filtering conditions as a primary filter. It might force Qdrant to read raw payload values instead of using the index.

indexed_only=true parameter: if the query is significantly faster, the optimizer is still running and has not yet indexed all segments.

optimizer_cpu_budget to reserve more CPU for queries

prevent_unoptimized=true to prevent creating segments with a large amount of unindexed data for searches. Instead, once a segment reaches the so-called indexing_threshold, all additional points will be added in a 'deferred state'.

Learn more here
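
As a rough illustration of two of the settings above, the sketch below builds request bodies as plain dictionaries: one for a keyword payload index marked `is_tenant`, and one for a filtered query that toggles `indexed_only`. The function names, the `tenant_id` field, and the sample values are hypothetical. The body shapes follow my understanding of Qdrant's REST API (payload index creation and the points query endpoint), so check them against the API reference for your version.

```python
import json
from typing import Any, Dict, List


def tenant_index_body(field_name: str) -> Dict[str, Any]:
    """Body for creating a keyword payload index flagged as the tenant field."""
    return {
        "field_name": field_name,
        "field_schema": {"type": "keyword", "is_tenant": True},
    }


def filtered_query_body(
    vector: List[float],
    tenant_field: str,
    tenant_value: str,
    limit: int = 10,
    indexed_only: bool = False,
) -> Dict[str, Any]:
    """Body for a query filtered by tenant, optionally limited to indexed segments."""
    return {
        "query": vector,
        "limit": limit,
        "filter": {"must": [{"key": tenant_field, "match": {"value": tenant_value}}]},
        "params": {"indexed_only": indexed_only},
    }


if __name__ == "__main__":
    # "tenant_id" and "acme" are placeholder values for illustration only.
    print(json.dumps(tenant_index_body("tenant_id"), indent=2))
    print(json.dumps(filtered_query_body([0.1, 0.2], "tenant_id", "acme", indexed_only=True), indent=2))
```

Sending the same query once with `indexed_only` off and once with it on, and comparing latencies, is one way to run the optimizer check described above.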
always_ram=false on quantization (disk thrashing on every search)
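
A minimal check for the `always_ram` setting might look like the sketch below. It assumes you have already fetched the collection's `quantization_config` as a dictionary keyed by method (`scalar`, `product`, or `binary`), which is how I understand the collection info response to be structured. The helper name is hypothetical, and treating a missing `always_ram` as "not pinned" is an assumption.

```python
from typing import Any, Dict, List, Optional

# Quantization methods that carry an always_ram option (assumed key names).
QUANTIZATION_METHODS = ("scalar", "product", "binary")


def methods_not_pinned_to_ram(quantization_config: Optional[Dict[str, Any]]) -> List[str]:
    """Return the configured quantization methods whose always_ram is not true."""
    if not quantization_config:
        return []  # no quantization configured, nothing to flag
    return [
        method
        for method in QUANTIZATION_METHODS
        if method in quantization_config
        and not quantization_config[method].get("always_ram", False)
    ]
```

For example, passing `{"scalar": {"type": "int8", "always_ram": False}}` flags `scalar`, while the same config with `always_ram` set to true flags nothing.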