AWS AI Practitioner Certification Notes

A comprehensive guide covering Artificial Intelligence fundamentals, Generative AI concepts, Responsible AI practices, Compliance & Governance, Security & Privacy, and AWS ML Services.

AI Introduction

Artificial Intelligence

Development of systems that perform tasks which would otherwise require human intelligence.

AI Layers

  • Data Layer
  • ML Algorithm Layer
  • Model Layer
  • Application Layer

Machine Learning

Branch of AI that focuses on building methods that allow machines to learn from data and make predictions based on it.

Deep Learning

Branch of ML that uses multi-layered neural networks to learn patterns from data.

Generative AI

Branch of deep learning in which pre-trained models generate new content, such as human-like text, images, audio and code.

Hierarchy

AI → ML → Deep Learning (Neural Networks) → GenAI

Core Concepts

Transformer Architecture

Core of GenAI models. Uses a self-attention mechanism to look at all the tokens in a sequence at once and figure out which ones are most relevant to each other.
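
A minimal sketch of self-attention in plain Python. It assumes the token embeddings are used directly as queries, keys and values (real transformers first apply learned projection matrices), and the `tokens` vectors are made up for illustration.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over the whole sequence at once."""
    dim = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score this token against every token in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Output is the relevance-weighted mix of all value vectors.
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional embeddings for three tokens.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` shows how strongly one token attends to every other token. Here the first token gives more weight to the similar second token than to the dissimilar third one.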

Neural Networks

Computational models made up of layers of interconnected nodes that process data and learn patterns.

  • Many hidden layers
  • Back-propagation process
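
A small sketch of back-propagation, assuming a toy two-layer network with one weight per layer. The starting weights, training example and learning rate are made up for illustration.

```python
import math

def forward(w1, w2, x):
    """Toy network: hidden = tanh(w1 * x), output = w2 * hidden."""
    hidden = math.tanh(w1 * x)
    return hidden, w2 * hidden

def loss_fn(w1, w2, x, target):
    """Squared error between the network output and the target."""
    _, output = forward(w1, w2, x)
    return (output - target) ** 2

def backward(w1, w2, x, target):
    """Propagate the error from the output back to each weight (chain rule)."""
    hidden, output = forward(w1, w2, x)
    d_output = 2 * (output - target)             # dLoss/dOutput
    grad_w2 = d_output * hidden                  # gradient for the output layer
    d_hidden = d_output * w2                     # error passed back to the hidden layer
    grad_w1 = d_hidden * (1 - hidden ** 2) * x   # through tanh, then to w1
    return grad_w1, grad_w2

# Hypothetical starting point and a single training example.
w1, w2, x, target, learning_rate = 0.3, -0.2, 1.0, 0.5, 0.1
initial_loss = loss_fn(w1, w2, x, target)
for _ in range(50):
    g1, g2 = backward(w1, w2, x, target)
    w1 -= learning_rate * g1
    w2 -= learning_rate * g2
final_loss = loss_fn(w1, w2, x, target)
```

Each update moves the weights against their gradients, so `final_loss` ends up below `initial_loss`.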

Important Neural Networks

GAN: Generative Adversarial Network

Approach in which two neural networks compete against each other.

  • Generator: Creates synthetic data
  • Discriminator: Detects whether data is fake or real

Use Case: Data augmentation

RNN: Recurrent Neural Network

Meant for processing sequential data, where the order of elements matters.

Use Cases: Speech recognition, Video Analysis

CNN: Convolutional Neural Network

  • They process pixel data of an image
  • They learn spatial hierarchies (edges, corners) of features from input images

Use Case: Single Image Analysis
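
A sketch of the core CNN operation: sliding a small kernel over pixel data to detect an edge. The kernel here is hand-written as an assumption; a real CNN learns its kernel weights during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# 4x4 image: dark left half (0), bright right half (1).
image = [[0, 0, 1, 1] for _ in range(4)]
# Hypothetical vertical-edge kernel: responds where brightness rises left to right.
vertical_edge = [[-1, 1], [-1, 1]]
feature_map = convolve2d(image, vertical_edge)
```

The feature map is non-zero only in the middle column, which is where the dark-to-bright edge sits.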

ResNet: Residual Network

Advanced CNN architecture

SVM: Support Vector Machine

Classical ML algorithm (not a neural network) for classification and regression. Separates data points into distinct classes by finding the decision boundary, called a hyperplane, with the widest margin between the classes.
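
A sketch of how a hyperplane separates two classes, assuming the weights and bias have already been learned. The values below are made up and describe the line x + y = 3.

```python
import math

def decision_value(weights, bias, point):
    """Signed score w.x + b: positive on one side of the hyperplane, negative on the other."""
    return sum(w * x for w, x in zip(weights, point)) + bias

def classify(weights, bias, point):
    """Assign class +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(weights, bias, point) >= 0 else -1

def distance_to_hyperplane(weights, bias, point):
    """Perpendicular distance from the point to the hyperplane."""
    norm = math.sqrt(sum(w * w for w in weights))
    return abs(decision_value(weights, bias, point)) / norm

# Hypothetical learned parameters.
weights, bias = [1.0, 1.0], -3.0
label_far = classify(weights, bias, [3.0, 3.0])
label_origin = classify(weights, bias, [0.0, 0.0])
```

Training an SVM means choosing the weights and bias so that the closest points of each class are as far from the hyperplane as possible.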

WaveNet

Model that generates raw audio waveforms. Used in speech synthesis to produce high-quality, human-like audio.

Types of AI Models

Foundation Model

Large, general-purpose AI models trained on broad data.

Amazon Titan is a family of high-performing FMs built by AWS itself.

Large Language Models (LLMs)

Specialized in understanding and generating coherent human-like text. Example: GPT

Diffusion Models

Generate images from text prompts.

Examples: Stable Diffusion (Stability AI), DALL-E

  • Diffusion models synthesize images by starting from pure noise and gradually de-noising it
  • Classifier-free guidance: Technique to improve the quality of generated images by controlling how strongly the diffusion model follows your text prompt
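
A sketch of the classifier-free guidance step. At each de-noising step the model predicts noise twice, once with the prompt and once without, and the two predictions are blended with a guidance scale. The noise values are made up, and `apply_guidance` is an illustrative name.

```python
def apply_guidance(unconditional, conditional, guidance_scale):
    """Blend two noise predictions; a larger scale pushes harder toward the prompt."""
    return [u + guidance_scale * (c - u) for u, c in zip(unconditional, conditional)]

# Made-up noise predictions for a 2-value "image".
noise_without_prompt = [0.0, 1.0]
noise_with_prompt = [1.0, 3.0]
guided = apply_guidance(noise_without_prompt, noise_with_prompt, guidance_scale=2.0)
```

A scale of 0 ignores the prompt, a scale of 1 uses the prompt-conditioned prediction as-is, and values above 1 exaggerate the difference so the image follows the prompt more strongly.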

Generative AI Concepts

Drawbacks of GenAI

I. Hallucinations

Asserting as true something that is actually incorrect.

Mitigation:

  • Lower the Temperature value to produce more focused output
  • Use RAG (Retrieval-Augmented Generation) to retrieve authoritative information and ground the model output in it
  • Automated Reasoning checks feature in Amazon Bedrock Guardrails
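
A toy sketch of the RAG idea from the mitigation list above: retrieve the most relevant document first, then put it into the prompt as grounding context. It assumes simple word overlap as the retriever (production systems typically use embeddings and a vector store), and the documents are made up.

```python
def retrieve(question, documents, top_n=1):
    """Rank documents by how many question words they share (toy retriever)."""
    question_words = set(question.lower().split())

    def overlap(doc):
        return len(question_words & set(doc.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:top_n]

def build_grounded_prompt(question, documents):
    """Place the retrieved text in the prompt so the model answers from it."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical knowledge base.
documents = [
    "Amazon Polly converts text to speech.",
    "Amazon Textract extracts text from documents.",
    "Amazon Translate translates text between languages.",
]
question = "Which service converts text to speech?"
prompt = build_grounded_prompt(question, documents)
```

The resulting prompt would then be sent to the foundation model, which answers from the retrieved context rather than from memory alone.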

II. Non-Determinism

Responses can differ even for the same prompt. A key reason is the Temperature setting of the Foundation Model, which controls creativity.

III. Limited Context Windows

The context window is the maximum number of tokens an LLM can consider at once when generating text. Anything beyond that limit is not seen by the model.

IV. Recency: Cutoff Date

The model knows nothing after its training cutoff date, so responses might be outdated.

V. Cost Intensive

VI. Data Challenges

  • Scarcity of new data - We can run out of training data. Synthetic data is one solution
  • Proliferation of AI-generated content may lead to degradation of models

Prompt Engineering: Overview

Designing, developing and optimizing prompts to enhance the output of Foundation Models.

Prompt engineering is often the first recommended step for customizing foundation models. It is the lowest-cost option because it requires no model training.

Prominent Use Case: Effective for guiding tone and style of generated output.

Anatomy of a Prompt

Instruction, Context, Input data, Output Indicator
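
A sketch of how the four parts fit together into one prompt. The function name and the example text are made up for illustration.

```python
def build_prompt(instruction, context, input_data, output_indicator):
    """Join the four parts of a prompt, skipping any that are empty."""
    parts = [instruction, context, input_data, output_indicator]
    return "\n\n".join(part for part in parts if part)

prompt = build_prompt(
    instruction="Summarize the customer review in one sentence.",
    context="You are a support analyst for an online bookstore.",
    input_data="Review: The delivery was late, but the book arrived in perfect condition.",
    output_indicator="Summary:",
)
```

The output indicator goes last so that the model continues directly from it.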

Parameters for Controlling Output

Temperature (typically 0-1): Controls Creativity/Randomness

  • A high temperature value means a more creative, less predictable response
  • Sampling at a non-zero temperature is why GenAI models are non-deterministic
  • Lower the temperature value to reduce Hallucinations
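
A sketch of how temperature reshapes the next-token probabilities. It assumes the common approach of dividing the model's raw scores (logits) by the temperature before applying softmax. The logits are made up.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; low temperature sharpens, high flattens."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]
focused = softmax_with_temperature(logits, 0.2)   # low temperature
creative = softmax_with_temperature(logits, 1.0)  # high temperature
```

At low temperature almost all probability lands on the top token, so output is focused and repeatable. At high temperature the other tokens keep a real chance of being sampled.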

Top-K

Limits the model to choosing from the K most probable tokens. It first sorts all possible tokens by probability, then samples only from the top K.
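
A sketch of Top-K filtering on a made-up next-token distribution. The kept probabilities are rescaled to sum to 1 before sampling, which is a common convention assumed here.

```python
def top_k_filter(probabilities, k):
    """Keep the k most probable tokens and renormalize their probabilities."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept = ranked[:k]
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Hypothetical next-token probabilities.
next_token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "bird": 0.05}
top_two = top_k_filter(next_token_probs, k=2)
```

With K = 2 only "cat" and "dog" remain as candidates; "fish" and "bird" can never be picked.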

Top-P: Nucleus Sampling (Size of the pool)

Pick the next token from the smallest set of most-probable tokens whose cumulative probability is >= P. A high value of P means a broader pool is considered; a low value restricts sampling to the most likely candidates.
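
A sketch of Top-P (nucleus) filtering on a made-up distribution. Unlike Top-K, the pool size is not fixed: it grows until the cumulative probability reaches P.

```python
def top_p_filter(probabilities, p):
    """Keep the smallest set of top tokens whose cumulative probability is >= p."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Hypothetical next-token probabilities.
next_token_probs = {"cat": 0.5, "dog": 0.3, "fish": 0.15, "bird": 0.05}
broad_pool = top_p_filter(next_token_probs, p=0.9)
narrow_pool = top_p_filter(next_token_probs, p=0.5)
```

With P = 0.9 three tokens are needed to reach the threshold; with P = 0.5 the single most likely token is enough.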

Prompt Engineering Techniques

I. Zero-Shot Prompting

A direct instruction without any examples. The model has to rely on its pre-trained knowledge. (Important point)

II. Few-Shot Prompting

Provide a few examples of the task to the model to guide its output.
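
A sketch contrasting zero-shot and few-shot prompts for the same task. The only difference is whether worked examples are placed before the new input. The task, labels and example texts are made up.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Build a prompt; with an empty examples list this is a zero-shot prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Text: {text}", f"Sentiment: {label}", ""]
    lines += [f"Text: {query}", "Sentiment:"]
    return "\n".join(lines)

instruction = "Classify the sentiment of the text as Positive or Negative."
examples = [
    ("I love this product.", "Positive"),
    ("The package arrived broken.", "Negative"),
]
query = "Setup was quick and painless."

zero_shot = build_few_shot_prompt(instruction, [], query)
few_shot = build_few_shot_prompt(instruction, examples, query)
```

The few-shot version shows the model the expected format and labels before asking for the new answer.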

III. Chain of Thought Prompting

Asking the model to break the problem down step by step - human-like reasoning instead of just guessing the answer.

IV. Negative Prompting

Explicitly instruct the model what not to include in the response.

V. Adversarial Prompting (Important)

Deliberately crafting inputs designed to exploit weaknesses in the model's behavior.

Purpose and use case: To test robustness of models.

Risks in Prompt Engineering

I. Model Poisoning

Malicious data inserted into training data of FM.

II. Hijacking and Prompt Injection

Influencing model output through malicious instructions in the prompt, often delivered via untrusted external content. The focus is on manipulating model behavior, for example by asking it to ignore previous instructions.

III. Jailbreaking

Attempting to override the restrictions and constraints of an AI model. The focus is on bypassing safety controls applied at the system level.

IV. Exposure

The model may output sensitive or confidential data. Mitigation: use Guardrails.

Responsible AI

Core Dimensions of Responsible AI

Fairness, Transparency, Explainability, Governance, Controllability, Security & Privacy

Services to Ensure Responsible AI

  • Guardrails in Amazon Bedrock
  • SageMaker Clarify: Detects bias and provides explainability. Supports fairness and transparency
  • SageMaker Model Monitor: Monitors model quality in production. Detects drift

Services for Governance

SageMaker Model Cards, SageMaker Model Dashboard, SageMaker Role Manager

Interpretability & Explainability

Interpretability

The degree to which a human can understand how a model works internally.

Decision trees have high interpretability.

Explainability

The degree to which a model's decisions/predictions can be explained.

Partial Dependence Plots

Global explanation - shows how a single feature affects the predicted outcome of an ML model, averaged across the whole dataset.
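
A sketch of how a partial dependence curve is computed: fix one feature to each value on a grid for every row, predict, and average. The `price_model` function and the dataset are made up for illustration.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average prediction over the dataset with one feature forced to each grid value."""
    averages = []
    for value in grid:
        predictions = []
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # override only the feature of interest
            predictions.append(model(modified))
        averages.append(sum(predictions) / len(predictions))
    return averages

def price_model(row):
    """Hypothetical model: price depends on size and number of rooms."""
    size, rooms = row
    return 50 * size + 10 * rooms

rows = [[1, 2], [2, 3], [3, 4]]  # [size, rooms]
curve = partial_dependence(price_model, rows, feature_index=0, grid=[1, 2])
```

Plotting `curve` against the grid values gives the partial dependence plot for the size feature.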

Shapley Values

Provide a local explanation of an individual prediction by showing how much each feature contributes to that specific model output.
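
A sketch of exact Shapley values for a tiny model. Each feature's contribution is averaged over every order in which features can be switched from a baseline value to the instance's value. The model, instance and baseline are made up, and this brute-force approach only scales to a handful of features.

```python
from itertools import permutations

def shapley_values(model, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    n = len(instance)
    contributions = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for feature in order:
            current[feature] = instance[feature]  # reveal this feature's real value
            updated = model(current)
            contributions[feature] += updated - previous
            previous = updated
    return [total / len(orderings) for total in contributions]

def toy_model(x):
    """Hypothetical model with an interaction between the two features."""
    return 2 * x[0] + 3 * x[1] + x[0] * x[1]

instance, baseline = [1.0, 2.0], [0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
```

The values add up to the difference between the prediction for the instance and the prediction for the baseline, so they split that one prediction among its features.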

Human Centered Design

Designing systems around human needs, values and context.

I. Design for Amplified Decision Making

AI systems should enhance/augment human decision-making capabilities by:

  • Minimizing risks in high-pressure environments
  • Bringing more clarity and simplicity

Compliance & Governance

Compliance Challenges

To comply with regulatory frameworks, we face the following challenges in AI:

I. Complexity and Opacity

It is difficult to audit how AI systems make decisions.

II. Dynamism

ML models continue to change; they are not static.

III. Emergent Capabilities

Unintended capabilities a system may have.

IV. Algorithmic Accountability

Algorithms should be transparent and explainable.

Compliance Tools

SageMaker Model Cards and AWS AI Service Cards: document the intended use, limitations and design choices of a model or AI service.

Governance

A. Governance Framework

  1. Establish Governance Board or Committee
  2. Define roles and responsibilities
  3. Implement Policies and Procedures

B. Governance Strategies

I. Define Policies: About data management and model training.
II. Review Cadence and Strategies:
  • Clear timelines for review
  • Technical reviews, Non-technical reviews
  • Testing and validation procedures
III. Transparency Standards: Publish information and document limitations, capabilities.
IV. Team Training

Data Lineage

Data lineage refers to the complete end-to-end journey of data through its lifecycle, from origin to consumption.

Where the data came from → how it got here → what happened along the way

Document

  • Source citation
  • Details of collection process
  • Methods used to clean data
  • Pre-processing and transforming data

Security & Privacy

Techniques for Security of AI Systems

I. Threat Detection

Identification of real-time active threats.

II. Vulnerability Management

Identifying, assessing and mitigating security weaknesses. Patch management and update process.

III. Infrastructure Protection

Access control list, Network segmentation, encryption

Secure Data Engineering: Best Practices

I. Assess Data Quality

Evaluate Completeness, Accuracy, Consistency. Identify biases, errors, inconsistencies.

II. Apply Data Privacy Enhancing Technologies

Data masking, obfuscation, encryption, tokenization.
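
A sketch of two of these techniques, masking and tokenization. It assumes an in-memory dictionary as the token store (a real vault would be a separately secured service), and the card number is a made-up test value.

```python
import secrets

def mask(value, visible=4, mask_char="*"):
    """Data masking: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]

class TokenVault:
    """Tokenization: swap a sensitive value for a random token and keep the mapping."""

    def __init__(self):
        self._store = {}

    def tokenize(self, value):
        token = "tok_" + secrets.token_hex(8)  # random, reveals nothing about the value
        self._store[token] = value
        return token

    def detokenize(self, token):
        return self._store[token]

card = "4111111111111111"
masked = mask(card)
vault = TokenVault()
token = vault.tokenize(card)
```

Masking is one-way and meant for display. Tokenization is reversible, but only for a party with access to the vault.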

III. Data Access Control

Only authenticated and authorized users should have access to data.

IV. Data Integrity

Robust data backup, recovery strategy.

Security Scoping Matrix for GenAI

Deals with security risks associated with the deployment of GenAI applications.

Order of Ownership (Low - High)

  1. Consumer App (Low Ownership): You just use a third-party app. At this stage, you do not own the model or the training data
  2. Enterprise App
  3. Pre-trained Models
  4. Fine-tuned Models
  5. Self-Trained Models (High Ownership)

AWS Security Services

I. Amazon Macie

Uses machine learning and pattern matching to discover sensitive data, such as personally identifiable information, stored in Amazon S3. It identifies patterns that match common types of sensitive information, such as credit card numbers.

II. AWS Config

Continuously monitors configuration of your AWS resources over time.

III. Amazon Inspector

Automated security assessment of EC2 instances, container images and Lambda functions.

IV. AWS Artifact

Provides on-demand access to AWS compliance documentation and AWS agreements.

AWS ML Services

Amazon Rekognition

Computer Vision service.

Detects objects in images.

Content Moderation APIs: Can analyze images and videos and detect inappropriate or unsafe content.

Amazon Comprehend

NLP service. Performs sentiment analysis on text and extracts insights from unstructured text data.

Amazon Translate

Automatically translates text between languages. The batch translation feature handles large volumes of text.

Amazon Personalize

You provide data signals of user activity such as history, page views and preferences. It then returns personalized recommendations.

Amazon Polly

Converts text to natural-sounding speech.

Amazon EMR: Elastic MapReduce

Big-data platform that enables large-scale data processing. Key strength: native support for Apache Spark, including PySpark. Runs complex feature engineering workflows in a highly distributed manner.

AWS Glue: ETL Service

Extract, Transform, Load operations before feature engineering.

Amazon Augmented AI

Human-in-the-loop service. Adds human review of model predictions. Human reviews are triggered under defined conditions, such as a low confidence score.
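
A sketch of the human-in-the-loop idea, assuming a simple confidence threshold as the trigger condition. This is not the Amazon A2I API; the function name, threshold and labels are made up for illustration.

```python
def route_prediction(label, confidence, threshold=0.80):
    """Send low-confidence predictions to a human reviewer, auto-approve the rest."""
    if confidence < threshold:
        return {"label": label, "status": "needs_human_review"}
    return {"label": label, "status": "auto_approved"}

# Hypothetical model outputs.
confident = route_prediction("invoice", confidence=0.97)
uncertain = route_prediction("receipt", confidence=0.55)
```

Only the uncertain prediction is queued for a reviewer, which keeps human effort focused where the model is least reliable.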

Amazon Transcribe

Speech to Text Service.

Amazon Textract

OCR - Optical Character Recognition Service. Automatically extracts text and structured data (tables) from documents and images. It cannot process videos.

Summary

This guide covers the fundamental concepts of AI, Generative AI, Responsible AI practices, compliance and governance frameworks, security best practices, and AWS ML services. It serves as a comprehensive reference for AI practitioners working with AWS cloud services.

Last Updated: January 2026 | Notes by Nadir Hussain