AWS Generative AI Certification Notes
A comprehensive guide covering Amazon Bedrock customization, RAG, CloudWatch integration, pricing, and Amazon Q services.
Amazon Bedrock Introduction
Amazon Bedrock is the main AWS service for building generative AI applications on top of foundation models.
Core of Bedrock
- Selecting a Foundation Model is the first step.
Bedrock Customization
Fine Tuning in Amazon Bedrock (Big part of Exam)
Process of adapting a pre-trained, general-purpose foundation model to a more specific dataset.
Fine-tuning uses labelled data (input paired with the desired output).
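As a rough illustration of what "labelled data" means here, the sketch below writes prompt/completion pairs as JSON Lines, the layout Bedrock documents for fine-tuning text models. The example records are made up for illustration, and the exact field names can differ per model, so treat this as an assumption to check against the model's documentation.

```python
import json

# Hypothetical labelled examples: each record pairs an input (prompt)
# with the desired output (completion).
labelled_examples = [
    {
        "prompt": "Define 'tachycardia' in one sentence.",
        "completion": "Tachycardia is a resting heart rate above 100 beats per minute.",
    },
    {
        "prompt": "What does 'force majeure' mean in a contract?",
        "completion": "It is a clause that excuses a party when extraordinary events prevent performance.",
    },
]


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)


jsonl_text = to_jsonl(labelled_examples)
print(jsonl_text)
```

The resulting text would typically be saved as a .jsonl file and uploaded to S3 before starting the customization job.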
Pre-requisites Steps in Fine Tuning
Important Use Cases of Fine-Tuning
- A curated labelled dataset is the best option for adapting a model to the terms and jargon of a particular field (medicine, law).
- Fine-tuning is the best choice when the model misses user intent or crucial information.
Other Customization Options
B. Continued Pre-training (Domain adaptation)
Provide unlabelled data to continue training the foundation model, so that it absorbs new domain knowledge and expands its knowledge base.
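For contrast with labelled fine-tuning data, here is a minimal sketch of unlabelled records: raw domain text with no target output. The single "input" field follows the layout Bedrock documents for continued pre-training; the sample sentences and the helper name are hypothetical.

```python
import json

# Hypothetical raw domain text: no prompt/answer pairing, just documents.
domain_documents = [
    "Basel III requires banks to hold a minimum common equity tier 1 ratio.",
    "A liquidity coverage ratio compares liquid assets with expected net outflows.",
]


def to_unlabelled_jsonl(documents):
    """Wrap each document in an {"input": ...} record, one JSON object per line."""
    return "\n".join(json.dumps({"input": doc}) for doc in documents)


unlabelled_jsonl = to_unlabelled_jsonl(domain_documents)
print(unlabelled_jsonl)
```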
C. Transfer Learning
Adapting a pre-trained model to a new related task. Fine-tuning is actually a specific kind of transfer learning.
D. Distillation
Use a large model (teacher) to generate training data for a smaller model (student), so the student approximates the teacher's behaviour at lower cost.
Evaluating Foundation Models
A. Human Evaluation
Humans compare the responses and assign scores (using an AWS-managed work team or your own work team). Responses of up to 2 models can be evaluated.
B. Automatic Evaluation
Uses built-in task types (text generation, text summarization, Q/A). A judge model compares the generated responses with benchmark answers.
Metrics for Automatic Evaluation
A. ROUGE: (Recall-Oriented Understudy for Gisting Evaluation)
Designed to evaluate text summaries by comparing them with human-created reference text.
- ROUGE-N: Measures the number of matching n-grams between generated text and reference text.
- ROUGE-L: Calculates longest common subsequence between reference text and generated text.
- ROUGE-L-SUM: Summary-level variant of ROUGE-L that computes the longest common subsequence sentence by sentence and aggregates the result over the whole summary.
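To make the ROUGE variants concrete, here is a minimal sketch of the recall form of ROUGE-N and ROUGE-L on whitespace tokens. It is a simplified illustration (no stemming, single reference), not the scoring code Bedrock uses; the function names and sample sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also appear in the candidate."""
    cand = Counter(ngrams(candidate.split(), n))
    ref = Counter(ngrams(reference.split(), n))
    if not ref:
        return 0.0
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    return overlap / sum(ref.values())


def lcs_length(a, b):
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the number of reference tokens."""
    ref_tokens = reference.split()
    if not ref_tokens:
        return 0.0
    return lcs_length(candidate.split(), ref_tokens) / len(ref_tokens)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))
print(rouge_n_recall(candidate, reference, n=2))
print(rouge_l_recall(candidate, reference))
```

One changed word costs a single unigram but two bigrams, which is why ROUGE-2 drops faster than ROUGE-1 on the same pair.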
B. BLEU: Bilingual Evaluation Understudy
Evaluates machine translations by measuring n-gram precision against reference translations. It applies a brevity penalty to outputs that are too short.
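The brevity penalty is easier to see in code. The sketch below is a simplified BLEU (unigrams and bigrams only, one reference, no smoothing), so its numbers will not match a full BLEU implementation; the function names are illustrative.

```python
import math
from collections import Counter


def clipped_precision(cand_tokens, ref_tokens, n):
    """Share of candidate n-grams found in the reference (counts clipped)."""
    cand = Counter(tuple(cand_tokens[i:i + n]) for i in range(len(cand_tokens) - n + 1))
    ref = Counter(tuple(ref_tokens[i:i + n]) for i in range(len(ref_tokens) - n + 1))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    return sum((cand & ref).values()) / total


def brevity_penalty(candidate_len, reference_len):
    """1.0 for long-enough candidates, exp(1 - r/c) for shorter ones."""
    if candidate_len == 0:
        return 0.0
    if candidate_len >= reference_len:
        return 1.0
    return math.exp(1 - reference_len / candidate_len)


def simple_bleu(candidate, reference, max_n=2):
    """Brevity penalty times the geometric mean of n-gram precisions."""
    cand_tokens, ref_tokens = candidate.split(), reference.split()
    precisions = [clipped_precision(cand_tokens, ref_tokens, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    return brevity_penalty(len(cand_tokens), len(ref_tokens)) * geo_mean


reference = "the cat sat on the mat"
print(simple_bleu("the cat sat on the mat", reference))
print(simple_bleu("the cat", reference))
```

The two-word candidate has perfect n-gram precision, yet its score is pulled down sharply because it is much shorter than the reference.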
C. BERTScore (Bidirectional Encoder Representations from Transformers) - AI option
Compares semantic similarity/contextualized embeddings between generated text and reference text
Use Case: Best for evaluating chatbot responses
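A toy sketch of the idea behind BERTScore: each token is matched to its most similar token on the other side by cosine similarity, and the matches are averaged into precision, recall, and F1. Real BERTScore uses contextual embeddings from a BERT-style model; the three-dimensional vectors below are hand-written assumptions purely for illustration.

```python
import math

# Hypothetical hand-made word vectors (NOT real BERT embeddings).
TOY_EMBEDDINGS = {
    "movie": (1.0, 0.0, 0.0),
    "film": (0.9, 0.1, 0.0),
    "great": (0.0, 1.0, 0.0),
    "excellent": (0.0, 0.9, 0.1),
    "terrible": (0.0, -1.0, 0.0),
}


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def greedy_match(tokens_a, tokens_b):
    """Average, over tokens_a, of the best similarity to any token in tokens_b."""
    best = [
        max(cosine(TOY_EMBEDDINGS[a], TOY_EMBEDDINGS[b]) for b in tokens_b)
        for a in tokens_a
    ]
    return sum(best) / len(best)


def toy_bertscore_f1(candidate, reference):
    """F1 of greedy-matched precision (candidate side) and recall (reference side)."""
    cand, ref = candidate.split(), reference.split()
    precision = greedy_match(cand, ref)
    recall = greedy_match(ref, cand)
    if precision + recall <= 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(toy_bertscore_f1("excellent film", "great movie"))
print(toy_bertscore_f1("terrible film", "great movie"))
```

"excellent film" shares no words with "great movie", so n-gram metrics would score it zero, while the embedding-based score stays high. That is the reason this metric suits free-form chatbot responses.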
Business Metrics for Evaluating a FM
- Customer satisfaction
- Average revenue per user
NOTE: Results of evaluation in Amazon Bedrock are stored to S3
PartyRock
- No-code Generative AI application building playground powered by Amazon Bedrock
- Describe the desired functionality in plain language and the Gen-AI application is generated, with no code required.
- It does not require an AWS account.
Use Case:
- Ideal for rapid prototyping.
- Testing and experimenting with Foundation Models
RAG (Retrieval-Augmented Generation)
Introduction
- Referencing external data source outside of training data of a Foundation Model.
- Prompt to the foundation model is augmented with data retrieved from external Knowledge Base.
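The augmentation step can be sketched as plain string assembly: retrieved passages are placed in the prompt ahead of the user's question. The template wording and function name below are assumptions for illustration; Knowledge Bases in Bedrock apply their own prompt templates.

```python
def build_augmented_prompt(question, retrieved_passages):
    """Combine retrieved passages and the user question into one prompt."""
    context = "\n".join(
        f"[{i}] {passage}" for i, passage in enumerate(retrieved_passages, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical passages, standing in for results from a knowledge base.
passages = [
    "Refunds are accepted within 30 days of delivery.",
    "Refunds are issued to the original payment method.",
]
prompt = build_augmented_prompt("How long do I have to request a refund?", passages)
print(prompt)
```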
Knowledge Base and Vector Databases
Knowledge Base is backed by Vector databases.
RAG Process
Large data is chunked into meaningful pieces and passed into embedding models such as Amazon Titan. Embeddings from these models are then stored into vector databases.
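A minimal sketch of the chunking step, assuming fixed-size chunks measured in words with a small overlap so that sentences cut at a boundary still appear whole in one chunk. The sizes and the function name are illustrative; real pipelines often chunk by tokens or by document structure.

```python
def chunk_words(text, chunk_size=8, overlap=2):
    """Split text into word chunks of chunk_size, sharing `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample_text = " ".join(f"w{i}" for i in range(20))
chunks = chunk_words(sample_text, chunk_size=8, overlap=2)
for chunk in chunks:
    print(chunk)
```

Each chunk would then be passed to an embedding model, and the resulting vector stored alongside the chunk text.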
Vector Databases
Vector databases store embeddings. They allow search based on semantic similarity.
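Semantic search over stored embeddings can be sketched with an in-memory list and cosine similarity. This is only a toy stand-in for a vector database: real stores use approximate nearest-neighbour indexes, and the two-dimensional vectors here are invented for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


class InMemoryVectorStore:
    """Toy vector store: keeps (text, embedding) pairs and scans them all."""

    def __init__(self):
        self._items = []

    def add(self, text, embedding):
        self._items.append((text, embedding))

    def search(self, query_embedding, k=2):
        """Return the k stored texts most similar to the query, best first."""
        scored = [
            (text, cosine_similarity(query_embedding, embedding))
            for text, embedding in self._items
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = InMemoryVectorStore()
store.add("refund policy", (1.0, 0.0))
store.add("shipping times", (0.0, 1.0))
store.add("return an item", (0.9, 0.2))

results = store.search((1.0, 0.1), k=2)
for text, score in results:
    print(text, round(score, 3))
```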
Embeddings
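Numerical (vector) representations of text such as tokens, sentences, or chunks. They capture semantic properties such as meaning and sentiment, so that similar content ends up close together in vector space.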
Examples of Vector Databases
I. Amazon OpenSearch Service (Serverless): Default vector store offered when creating Knowledge Bases in Amazon Bedrock.
II. Amazon Neptune Analytics: Graph database that enables high performance graph analytics.
III. Amazon DynamoDB
IV. Amazon Aurora: Proprietary relational database built by AWS.
V. Amazon S3: Cost effective and durable.
RAG: Input Data Sources
The external input data that is chunked into pieces may come from the following sources:
- Amazon S3
- Confluence
- SharePoint
- Salesforce
- Webpages
RAG: Use Cases
Use Cases:
- RAG is helpful when the Foundation Model needs current or frequently changing data that was not in its training data.
- Reducing hallucinations by grounding model responses in authoritative sources.
Examples:
- Customer service chatbot
- Legal research and analysis
- Healthcare question answering
Bedrock CloudWatch Integration & Pricing
Bedrock Integration with CloudWatch
A. Model Invocation Logging
Collects metadata, requests and responses for all model invocations in your account. Destination of logs: S3, CloudWatch Logs, or both.
Note: This is a region-level setting and does not apply to Knowledge Bases.
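A hedged sketch of how this setting might be turned on with boto3. It assumes the `put_model_invocation_logging_configuration` call on the `bedrock` client with a `loggingConfig` payload; the log group, role ARN, and bucket names are placeholders, and the field names should be checked against the current boto3 documentation before use.

```python
def build_logging_config(log_group_name=None, role_arn=None,
                         bucket_name=None, key_prefix="bedrock-logs"):
    """Build a loggingConfig payload for CloudWatch Logs, S3, or both."""
    config = {"textDataDeliveryEnabled": True}
    if log_group_name:
        config["cloudWatchConfig"] = {
            "logGroupName": log_group_name,
            "roleArn": role_arn,
        }
    if bucket_name:
        config["s3Config"] = {"bucketName": bucket_name, "keyPrefix": key_prefix}
    if "cloudWatchConfig" not in config and "s3Config" not in config:
        raise ValueError("choose at least one destination: CloudWatch Logs or S3")
    return config


def enable_invocation_logging(config, region_name):
    """Apply the configuration in one region (the setting is per region)."""
    import boto3  # imported lazily so the sketch can be read without AWS access

    client = boto3.client("bedrock", region_name=region_name)
    client.put_model_invocation_logging_configuration(loggingConfig=config)


# Placeholder names, for illustration only.
logging_config = build_logging_config(
    log_group_name="/example/bedrock/invocations",
    role_arn="arn:aws:iam::123456789012:role/ExampleBedrockLoggingRole",
    bucket_name="example-bedrock-logs-bucket",
)
print(logging_config)
```

Because the setting is regional, `enable_invocation_logging` would need to be called once for each region where models are invoked.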
B. CloudWatch Metrics
Bedrock publishes metrics to CloudWatch.
contentFilteredCount is a metric that helps you check whether Guardrails are functioning.
Bedrock Pricing Models
A. On-Demand
Applies only to Base Foundation Models. Charged based on usage (tokens processed).
B. Batch Pricing
Discount of up to 50% compared with On-Demand. Suitable when many predictions can be submitted together and the results are not needed immediately.
C. Provisioned Throughput
Mandatory for customized or fine-tuned models. It reserves capacity for a committed period of time.
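A small worked example of the token-based arithmetic behind On-Demand and Batch pricing. The per-1,000-token prices below are invented placeholders, not real Bedrock prices; actual rates vary by model and region.

```python
# Hypothetical prices in USD per 1,000 tokens (placeholders, not real rates).
HYPOTHETICAL_PRICE_IN_PER_1K = 0.003
HYPOTHETICAL_PRICE_OUT_PER_1K = 0.015


def on_demand_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Cost = input tokens and output tokens, each billed per 1,000."""
    return (input_tokens / 1000) * price_in_per_1k + (output_tokens / 1000) * price_out_per_1k


def batch_cost(on_demand, discount=0.5):
    """Apply a batch discount; the notes cap it at 50%."""
    if not 0 <= discount <= 0.5:
        raise ValueError("discount must be between 0 and 0.5")
    return on_demand * (1 - discount)


monthly_on_demand = on_demand_cost(
    input_tokens=2_000_000,
    output_tokens=500_000,
    price_in_per_1k=HYPOTHETICAL_PRICE_IN_PER_1K,
    price_out_per_1k=HYPOTHETICAL_PRICE_OUT_PER_1K,
)
monthly_batch = batch_cost(monthly_on_demand, discount=0.5)
print(round(monthly_on_demand, 2), round(monthly_batch, 2))
```

With these placeholder prices, output tokens dominate the bill even though there are four times as many input tokens, which is typical when output is priced higher than input.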
Cost Order of Model Improvement Techniques
Prompt Engineering (lowest cost) → RAG → Instruction-based fine tuning → Domain Adaptation fine-tuning (Most expensive Customization)
GuardRails in Bedrock
Application safeguards to filter undesired or harmful content.
Important GuardRails
- Content Filters: Filter out content from prompts/responses.
- Denied Topics: Topics the application must refuse to discuss.
- Contextual Grounding Check: Detects and reduces hallucinations by filtering out response content that is not supported by the given source material.
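To show how these checks fit together, here is a deliberately naive sketch: a keyword match for denied topics and a word-overlap score as a stand-in for grounding. This is not how Bedrock Guardrails are implemented; the function names, the threshold, and the overlap heuristic are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and keep alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def mentions_denied_topic(text, denied_keywords):
    """True if any denied keyword appears as a word in the text."""
    tokens = set(tokenize(text))
    return any(keyword.lower() in tokens for keyword in denied_keywords)


def grounding_score(response, source):
    """Share of response words that also occur in the source material."""
    response_tokens = tokenize(response)
    if not response_tokens:
        return 0.0
    source_tokens = set(tokenize(source))
    return sum(1 for t in response_tokens if t in source_tokens) / len(response_tokens)


def apply_guardrail(response, source, denied_keywords, threshold=0.7):
    """Block denied topics first, then responses poorly supported by the source."""
    if mentions_denied_topic(response, denied_keywords):
        return "BLOCKED: denied topic"
    if grounding_score(response, source) < threshold:
        return "BLOCKED: not grounded in source"
    return response


source = "The refund window is 30 days from delivery."
denied = ["crypto"]
print(apply_guardrail("The refund window is 30 days.", source, denied))
print(apply_guardrail("Refunds are available for 90 days and include free shipping.", source, denied))
print(apply_guardrail("You should buy crypto.", source, denied))
```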
Harmful Categories
Hate, Violence, Sexual, Insult, Misconduct
Bedrock Agents
Manage and carry out multi-step tasks, such as infrastructure provisioning and application deployment. Agents are configured with pre-defined action groups that specify the actions they may perform.
Agents integrate with other systems, services, databases and APIs to exchange data.
Amazon Q
Amazon Q is all about your internal company data.
Amazon Q Business
Gen-AI assistant designed to help employees find information, generate content and automate tasks based on the organization's internal data. Similar to ChatGPT, but for your company's private data.
Admin Controls in Amazon Q
Control responses by blocking specific words or topics.
Similar to Guardrails in Amazon Bedrock.
Amazon Q Apps
Create GenAI apps based on your company data by using natural language without any coding.
Amazon Q Developer
Helps developers build faster by reducing time spent on software development problems.
Generates code samples, tracks references, and helps ensure compliance with open-source licensing.
Last Updated: January 2026 | Notes by Nadir Hussain