What are large language models?
Large language models (LLMs) are deep learning systems trained on large text corpora to predict the next token in a sequence, which lets them interpret and generate text and other language-like sequences such as code. They can summarize, classify, answer questions, write code, extract information, transform documents, and support conversational interfaces.
Prompting
Prompting is the practice of giving an LLM instructions, examples, context, and output requirements. Good prompts are clear, specific, grounded in data, and aligned with the desired format.
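As a minimal sketch of how those four parts can be assembled, the snippet below builds a prompt string from an instruction, context, examples, and an output requirement. The function name `build_prompt`, the section labels, and the sample review data are illustrative assumptions, not a standard API or format.

```python
def build_prompt(instruction, context, examples, output_format):
    """Assemble a prompt from instruction, context, examples, and output format.

    Hypothetical helper for illustration; the section labels are arbitrary.
    """
    parts = [f"Instruction: {instruction}"]
    if context:
        parts.append(f"Context:\n{context}")
    # Each example is an (input, output) pair shown to the model.
    for i, (example_input, example_output) in enumerate(examples, start=1):
        parts.append(f"Example {i}:\nInput: {example_input}\nOutput: {example_output}")
    parts.append(f"Output format: {output_format}")
    return "\n\n".join(parts)


prompt = build_prompt(
    instruction="Classify the sentiment of the review as positive or negative.",
    context="Reviews come from a hypothetical electronics store.",
    examples=[("Battery died in a day.", "negative")],
    output_format="A single lowercase word.",
)
print(prompt)
```

Keeping the parts separate makes it easier to change one of them, for example swapping the examples, without rewriting the whole prompt.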
Embeddings
Embeddings convert text, images, or other objects into numerical vectors that capture semantic similarity. They are useful for search, clustering, recommendation, deduplication, and retrieval systems.
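Semantic similarity between two embeddings is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embeddings come from a trained model and usually have hundreds or thousands of dimensions. The names `toy_embeddings` and `most_similar` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen here for zero vectors.
    return dot / (norm_a * norm_b)


# Made-up vectors, NOT the output of a real embedding model.
toy_embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}


def most_similar(word):
    """Return the other word whose vector is closest to `word`."""
    query = toy_embeddings[word]
    scored = [
        (other, cosine_similarity(query, vector))
        for other, vector in toy_embeddings.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


print(most_similar("cat"))
```

With these toy vectors, "cat" is closest to "kitten" because their vectors point in nearly the same direction, which is the property that search and deduplication rely on.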
Retrieval-augmented generation
Retrieval-augmented generation (RAG) connects an LLM to an external knowledge source. Instead of relying only on what the model learned during training, the system retrieves relevant documents at query time and passes them to the model as context. A typical pipeline:
- Split documents into chunks.
- Create embeddings.
- Store vectors in a database.
- Retrieve relevant chunks for a query.
- Ask the model to answer using retrieved context.
- Evaluate accuracy, citations, and refusal behavior.
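The first five steps above can be sketched end to end in a few lines. This is a toy, in-memory version under loud assumptions: `embed` is a word-count stand-in for a real embedding model, a Python list stands in for a vector database, the sample documents are invented, and the final step only builds the prompt rather than calling any model. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, max_words=12):
    """Step 1: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Step 2 (toy): word counts as a sparse vector, standing in for a real model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Step 3: store (chunk, vector) pairs; a list stands in for a vector database."""
    return [(chunk, embed(chunk)) for doc in documents for chunk in split_into_chunks(doc)]


def retrieve(index, query, k=2):
    """Step 4: return the k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_rag_prompt(query, chunks):
    """Step 5: ask the model to answer from the retrieved context only."""
    context = "\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer using only the context below. Cite sources by number. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Invented sample documents for illustration.
documents = [
    "Refunds are accepted within 30 days of purchase with a receipt.",
    "Standard shipping takes five business days within the country.",
]
index = build_index(documents)
question = "How many days do I have for refunds?"
top_chunks = retrieve(index, question, k=1)
rag_prompt = build_rag_prompt(question, top_chunks)
print(rag_prompt)
```

The prompt explicitly asks for numbered citations and for a refusal when the context is insufficient, which gives the evaluation step something concrete to check.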
Fine-tuning vs RAG
| Method | Best for | Risk |
|---|---|---|
| RAG | Grounding answers in changing documents and knowledge bases. | If retrieval misses the relevant chunks, answers are weak or ungrounded. |
| Fine-tuning | Teaching style, format, domain behavior, or repeated task patterns. | Results depend on carefully curated training data and rigorous evaluation. |
Evaluation
Generative AI should be evaluated for accuracy, hallucination risk, completeness, safety, latency, cost, and user satisfaction. For business systems, evaluation must include real examples from the target workflow.
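A minimal sketch of such an evaluation loop is shown below, covering only two of the dimensions listed: accuracy (as case-insensitive exact match) and latency. The `stub_model` function and the question/answer pairs are invented placeholders; in practice the examples would come from the target workflow, and exact match would often be replaced by a more forgiving scoring method.

```python
import time


def evaluate(model_fn, examples):
    """Run model_fn on each (question, expected) pair and report simple metrics."""
    correct = 0
    latencies = []
    for question, expected in examples:
        start = time.perf_counter()
        answer = model_fn(question)
        latencies.append(time.perf_counter() - start)
        # Exact match after normalizing case and whitespace.
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    total = len(examples)
    return {
        "accuracy": correct / total if total else 0.0,
        "mean_latency_s": sum(latencies) / total if total else 0.0,
    }


def stub_model(question):
    """Placeholder for a real model call; returns canned answers."""
    canned = {"What is the refund window?": "30 days"}
    return canned.get(question, "I don't know")


# Invented examples; real ones should come from the target workflow.
examples = [
    ("What is the refund window?", "30 days"),
    ("What is the shipping time?", "5 business days"),
]
report = evaluate(stub_model, examples)
print(report)
```

Running the same example set after every prompt, retrieval, or model change makes regressions visible before users see them.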
