IVQ 151-200
Section 16: Data Preparation & Ingestion (10 Questions)
What preprocessing steps are needed before training a GenAI model?
How do you clean noisy textual data for GenAI training?
What is tokenization drift and how do you prevent it?
How do you manage out-of-vocabulary (OOV) tokens?
How would you prepare a custom dataset for fine-tuning GPT?
What is the role of chunking in RAG pipelines?
How do you handle multi-language data ingestion for a GenAI use case?
How do you anonymize personally identifiable data before training?
What are the tradeoffs between training on documents vs. dialogue data?
How do you balance dataset diversity without sacrificing relevance?
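Two of the questions above (cleaning noisy text and chunking for RAG) are easiest to answer with a concrete baseline. The sketch below is illustrative, not a production pipeline: the function names `clean_text` and `chunk_text`, and the character-based chunk size, are assumptions chosen for the example.

```python
import re

def clean_text(raw: str) -> str:
    """Strip HTML/markup remnants and collapse runs of whitespace."""
    no_tags = re.sub(r"<[^>]+>", " ", raw)        # drop tags like <p>, <div>
    return re.sub(r"\s+", " ", no_tags).strip()   # normalize whitespace

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with overlap,
    a common baseline for RAG ingestion before embedding."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

In an interview answer, the overlap is worth calling out: it prevents a sentence that straddles a chunk boundary from being lost to both neighboring chunks at retrieval time.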
Section 17: Long-Term Memory & Context Handling (10 Questions)
How do LLMs handle long context windows, and what are the limits?
What is memory replay in agent frameworks?
How does ReAct differ from simple tool-calling agents?
What is “episodic memory” in LLMs?
How do you store and retrieve long-term memory using vector DBs?
How do you deal with context loss in multi-turn conversations?
What’s the difference between external and internal memory for agents?
How does Claude 2/3 manage longer context better than GPT-4?
What strategies help chunk documents for better summarization?
How do you evaluate memory relevance in GenAI workflows?
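The store-and-retrieve pattern behind long-term memory (and behind vector DBs generally) can be sketched in a few lines. This is a toy, assuming the caller supplies embedding vectors; a real system would use an embedding model and a vector DB such as Qdrant for scale and persistence.

```python
import math

class MemoryStore:
    """Toy long-term memory: stores (text, vector) pairs and retrieves
    the entries most similar to a query vector by cosine similarity."""

    def __init__(self):
        self.entries = []  # list of (text, vector)

    def add(self, text, vector):
        self.entries.append((text, vector))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def search(self, query_vec, k=2):
        # Rank all stored memories by similarity and return the top k texts
        ranked = sorted(self.entries,
                        key=lambda e: self._cosine(e[1], query_vec),
                        reverse=True)
        return [text for text, _ in ranked[:k]]
```

Evaluating "memory relevance" then reduces to checking whether the top-k retrieved texts actually answer the current turn, which is why retrieval metrics like recall@k come up in these questions.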
Section 18: Open Source & Model Hosting (10 Questions)
Compare Mistral, LLaMA 2, and Falcon models.
How do you host an open-source LLM using Ollama or Text Generation Web UI?
What are the benefits of vLLM for serving LLMs in production?
How do Hugging Face Inference Endpoints work for GenAI?
What is quantization-aware training (QAT)?
How do you deploy LLaMA 2 using Hugging Face Transformers?
What is the role of Triton or ONNX in GenAI inference?
How do you benchmark different GenAI models locally?
How does OpenRouter help route across multiple LLMs?
What are the licensing concerns when using open-source LLMs in commercial apps?
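For the local-benchmarking question, a small harness is often the expected answer shape. The sketch below measures per-prompt latency for any `generate(prompt)` callable; the stub passed in the test is a placeholder, and in practice you would swap in a real client call (e.g. to an Ollama or vLLM server).

```python
import statistics
import time

def benchmark(generate, prompts, warmup=1):
    """Time each call to generate(prompt) and report summary latency stats.
    `generate` is any callable; warmup calls absorb lazy model loading."""
    for p in prompts[:warmup]:
        generate(p)                              # warm-up, not timed
    latencies = []
    for p in prompts:
        start = time.perf_counter()
        generate(p)
        latencies.append(time.perf_counter() - start)
    return {
        "mean_s": statistics.mean(latencies),
        "p95_s": sorted(latencies)[int(0.95 * (len(latencies) - 1))],
    }
```

Reporting p95 alongside the mean matters for serving comparisons (e.g. vLLM vs. a naive Transformers loop), since tail latency is what users notice.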
Section 19: Code & API Use Cases (10 Questions)
How do you use OpenAI’s function calling to interact with APIs?
Build a Python script to call GPT-4 for summarizing a PDF.
Write a prompt template to extract structured data from unstructured reviews.
How would you use GenAI to create SQL queries from English prompts?
How do you validate user inputs before passing them to an LLM?
Build a FastAPI endpoint that takes user input and calls a GenAI model.
What’s the best way to batch prompts for the OpenAI API to reduce cost?
How can you use GenAI to classify and route customer tickets?
Implement a RAG flow using LangChain and Qdrant.
How do you cache frequent queries in a GenAI-powered web app?
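The caching question above has a standard minimal answer: memoize responses keyed by prompt with a time-to-live, so identical queries within the TTL skip the model call entirely. The class name and TTL default below are illustrative; production apps typically back this with Redis rather than an in-process dict.

```python
import time

class TTLCache:
    """Minimal time-to-live cache for LLM responses: identical prompts
    within `ttl` seconds are served from memory instead of re-calling the API."""

    def __init__(self, ttl=300.0):
        self.ttl = ttl
        self._store = {}  # prompt -> (timestamp, response)

    def get_or_call(self, prompt, call):
        now = time.monotonic()
        hit = self._store.get(prompt)
        if hit and now - hit[0] < self.ttl:
            return hit[1]                       # cache hit: skip the API call
        response = call(prompt)                 # cache miss: call the model
        self._store[prompt] = (now, response)
        return response
```

Note the caveat worth raising in an interview: caching is only safe for deterministic or low-temperature calls, since cached output freezes whatever the model said the first time.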
Section 20: Enterprise Architecture & Deployment (10 Questions)
What architecture would you recommend for a GenAI-powered document search system?
How do you secure an LLM API used in internal enterprise tools?
What are the tradeoffs between using managed LLMs and self-hosting?
How do you enforce audit logs and traceability in a GenAI pipeline?
How would you scale an LLM-based email summarizer for 1M users?
What’s the role of message queues (e.g., Kafka, RabbitMQ) in GenAI backends?
How do you integrate GenAI with CI/CD workflows?
What’s a good microservices structure for a GenAI-powered SaaS platform?
How do you perform load testing on GenAI endpoints?
How do you maintain versioning for prompts, models, and embeddings in production?
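For the versioning question, one common answer is to key each prompt template by a hash of its content, so deployments pin an exact version and audit logs can record which version produced which output. The registry below is a minimal sketch of that idea; the class and method names are assumptions, and a real system would persist versions in a database or a tool like MLflow.

```python
import hashlib

class PromptRegistry:
    """Illustrative prompt versioning: each template is keyed by a
    content hash, so a deployment can pin and audit exact versions."""

    def __init__(self):
        self._versions = {}  # version_id -> template

    def register(self, template: str) -> str:
        # Same content always yields the same id; any edit yields a new one
        version_id = hashlib.sha256(template.encode()).hexdigest()[:12]
        self._versions[version_id] = template
        return version_id

    def render(self, version_id: str, **params) -> str:
        return self._versions[version_id].format(**params)
```

The same content-addressing trick extends to embeddings: hash the (model name, chunking config, source document) tuple so you know exactly which index a stored vector came from.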