Qdrant Interview Questions

1️⃣ Core Architecture & Data Model

  1. What is the internal structure of a Point in Qdrant?

  2. How does Qdrant store vectors and payload separately?

  3. What is the role of a Segment in Qdrant?

  4. What is the difference between a Segment and a Collection?

  5. How does Qdrant handle multi-vector support inside a single point?

  6. What is the role of the WAL (write-ahead log) in Qdrant?

  7. How are deleted points handled internally?

  8. What is the difference between soft delete and hard delete in Qdrant?

  9. How does Qdrant ensure idempotent upserts?

  10. What is the lifecycle of a point from insertion to searchability?


2️⃣ Indexing & Performance Internals

  1. What ANN algorithm does Qdrant use internally?

  2. How does HNSW work in Qdrant?

  3. What are the M and ef parameters in HNSW?

  4. What is the difference between ef_construction and ef_search (ef_construct and hnsw_ef in Qdrant)?

  5. How does Qdrant balance recall vs latency?

  6. What happens when an index build is still in progress?

  7. How does Qdrant handle re-indexing after a bulk insert?

  8. What is the impact of vector dimensionality on memory usage?

  9. How does quantization improve performance?

  10. What is the difference between Scalar and Product Quantization in Qdrant?
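
For the dimensionality and quantization questions above, a back-of-the-envelope calculation is a useful starting point. The sketch below is a minimal estimate that assumes float32 components (4 bytes each) for the original vectors and 1 byte per component after int8 scalar quantization. It counts raw vector storage only, so HNSW graph links, payload, and allocator overhead are deliberately left out. The function name `vector_memory_bytes` and the example sizes are illustrative and not part of Qdrant.

```python
def vector_memory_bytes(num_vectors: int, dim: int, bytes_per_component: int = 4) -> int:
    """Raw storage for the vectors alone (no index links, payload, or overhead)."""
    if num_vectors < 0 or dim <= 0 or bytes_per_component <= 0:
        raise ValueError("sizes must be positive")
    return num_vectors * dim * bytes_per_component


# Hypothetical workload: one million 768-dimensional embeddings.
full_precision = vector_memory_bytes(1_000_000, 768)        # float32: 4 bytes per component
int8_quantized = vector_memory_bytes(1_000_000, 768, 1)     # int8: 1 byte per component

print(f"float32: {full_precision / 2**30:.2f} GiB")
print(f"int8:    {int8_quantized / 2**30:.2f} GiB")
print(f"ratio:   {full_precision // int8_quantized}x")
```

Memory grows linearly with both the vector count and the dimension, which is why halving the dimension or quantizing components has such a direct effect on the footprint.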


3️⃣ Collection Design & Multi-Tenancy

  1. Why are too many collections discouraged in Qdrant?

  2. How does Qdrant achieve isolation without separate collections?

  3. What is group_id and how does it help multi-tenancy?

  4. What is the trade-off between payload filtering and multiple collections?

  5. How does Qdrant optimize filtered search?

  6. What is payload indexing?

  7. How do keyword, integer, and full-text payload indexes differ?

  8. How do filterable indexes reduce candidate set size?

  9. What is the cost of high-cardinality payload fields?

  10. When should you split collections anyway?
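
The payload-based multi-tenancy pattern these questions refer to can be sketched as a request body. The example below builds a search body in the shape of Qdrant's REST filter syntax (`must` with a `key`/`match` condition), restricting results to one tenant through the `group_id` payload field mentioned above. The helper names `tenant_filter` and `search_body`, the tenant value, and the example vector are hypothetical; the sketch only constructs the JSON and sends nothing.

```python
import json


def tenant_filter(group_id: str) -> dict:
    """Filter that keeps only points whose payload field group_id equals the tenant."""
    return {"must": [{"key": "group_id", "match": {"value": group_id}}]}


def search_body(vector: list[float], group_id: str, limit: int = 5) -> dict:
    """Body for a filtered vector search scoped to a single tenant."""
    return {"vector": vector, "filter": tenant_filter(group_id), "limit": limit}


# Hypothetical tenant and query vector, for illustration only.
body = search_body([0.1, 0.2, 0.3], "tenant_1", limit=3)
print(json.dumps(body, indent=2))
```

Because every tenant lives in the same collection, a filter like this is most effective when `group_id` also has a keyword payload index.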


4️⃣ Query Execution Path

  1. What happens internally when you call /search?

  2. How does Qdrant combine vector similarity with filters?

  3. Is filtering pre-ANN or post-ANN?

  4. What is the difference between candidate generation and re-ranking?

  5. How does score normalization work?

  6. How is cosine similarity computed internally?

  7. What is exact search mode?

  8. How does Qdrant support hybrid search?

  9. How does Qdrant support range queries?

  10. What is the cost of using nested filters?
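
For the cosine similarity question above, Qdrant's documentation describes cosine distance as a dot product over vectors that are normalized when they are uploaded. The sketch below illustrates that idea in plain Python. It is a minimal illustration, not Qdrant's implementation, and the function names are made up for the example.

```python
import math


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length; done once, at upload time, in this sketch."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a: list[float], b: list[float]) -> float:
    """Plain dot product; equals cosine similarity when both inputs are unit length."""
    if len(a) != len(b):
        raise ValueError("dimension mismatch")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    return dot(normalize(a), normalize(b))


parallel = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])    # same direction
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])            # perpendicular
print(round(parallel, 6), round(orthogonal, 6))
```

Normalizing once at write time means each comparison at query time is only a dot product, which is cheaper than recomputing both norms per candidate.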


5️⃣ Distributed Architecture

  1. How does Qdrant shard data?

  2. What is the role of replicas?

  3. How is consistency maintained?

  4. What consistency levels are supported?

  5. What happens during node failure?

  6. How is leader election handled?

  7. How does Qdrant rebalance shards?

  8. What is write consistency vs read consistency?

  9. How are segments synchronized across replicas?

  10. What is the cost of replication on ingestion throughput?


6️⃣ Storage & Memory Management

  1. What is mmap storage in Qdrant?

  2. When does Qdrant load vectors into RAM?

  3. How does Qdrant optimize memory footprint?

  4. What is the difference between in-memory and on-disk collections?

  5. How does Qdrant handle large datasets (100M+ vectors)?

  6. What is the impact of SATA SSD vs NVMe storage?

  7. How does compaction work?

  8. What are optimizer thresholds?

  9. How does Qdrant prevent fragmentation?

  10. How does background optimization affect query latency?


7️⃣ RAG-Specific Design Questions

Since you’re benchmarking FAISS, Weaviate, Milvus, Pinecone, etc., these are especially relevant:

  1. How would you design a multi-PDF RAG system using one collection?

  2. Should each document be a collection or a payload field?

  3. How would you implement document-level isolation?

  4. How does Qdrant handle chunk-level vs document-level search?

  5. How would you implement semantic + metadata hybrid retrieval?

  6. How would you implement recency boosting?

  7. How would you support versioned embeddings?

  8. How would you implement re-ranking after ANN?

  9. How would you store embedding model version in payload?

  10. How would you migrate a collection after an embedding dimension change?


8️⃣ Failure & Edge Cases

  1. What happens if a vector's dimension does not match the collection config?

  2. What happens if the payload schema changes?

  3. What happens during an abrupt shutdown?

  4. How does WAL recovery work?

  5. How do you handle index corruption?

  6. How do you debug low recall?

  7. How do you debug high latency?

  8. How do you measure recall offline?

  9. How do you benchmark Qdrant correctly?

  10. Which metrics matter: QPS, P99 latency, recall@k?
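
For the offline recall question above, the usual approach is to compare the IDs returned by the approximate search against ground truth from an exact (brute-force) search over the same data. The sketch below is a minimal version of that comparison, assuming the two ID lists per query have already been collected. The function names and sample IDs are illustrative.

```python
def recall_at_k(approx_ids: list[int], exact_ids: list[int], k: int) -> float:
    """Fraction of the exact top-k neighbours that the approximate top-k also returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = set(approx_ids[:k]) & truth
    return len(found) / len(truth)


def mean_recall_at_k(approx: list[list[int]], exact: list[list[int]], k: int) -> float:
    """Average recall@k over a batch of queries."""
    if len(approx) != len(exact) or not approx:
        raise ValueError("need the same non-zero number of queries in both lists")
    return sum(recall_at_k(a, e, k) for a, e in zip(approx, exact)) / len(approx)


# Two hypothetical queries: the first misses one true neighbour, the second misses none.
approx_results = [[1, 2, 3, 9], [5, 6, 7, 8]]
exact_results = [[1, 2, 3, 4], [8, 7, 6, 5]]
print(mean_recall_at_k(approx_results, exact_results, k=4))
```

Recall measured this way is only meaningful alongside the latency and throughput numbers gathered with the same search parameters.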


9️⃣ Algorithmic / Research-Level Questions

  1. How does HNSW compare to IVF?

  2. Why does Qdrant use HNSW rather than a pure IVF index such as those in FAISS?

  3. What are graph-based ANN limitations?

  4. How does intrinsic dimensionality affect HNSW?

  5. When does ANN degrade to near-linear scan?

  6. What is the curse of dimensionality in a vector database?

  7. How would you design a better ANN algorithm?

  8. How does dynamic graph update affect recall?

  9. What is the theoretical complexity of HNSW?

  10. When is brute-force search better?

