
AWS Landscape 2: Environments

A well-architected SAP landscape in AWS requires multiple isolated environments to ensure security, compliance, and operational efficiency. Separating workloads into different accounts allows:
  • Environment Isolation: Prevents accidental access between Production and non-Production environments.
  • Centralized Services: Shared services like Active Directory, License Servers, and Secrets Manager can be centrally managed.
  • Centralized Logging & Auditing: Logs from all environments can be aggregated for compliance and security monitoring.
  • Cost Tracking & Governance: Each account can be monitored independently for usage and billing.
  • Scalability: New environments, such as Disaster Recovery or Training, can be added without disrupting existing workloads.
Objective:
Define and implement SAP landscape environments in AWS to support deployment, testing, and production workloads. Each environment should reside in a separate AWS account to ensure:
  • Isolation and security
  • Clear billing and cost tracking
  • Governance and compliance
  • Scalability for future environments like Training or Disaster Recovery
Environment Overview
Environment     Purpose                                       AWS Account Mapping
Production      Live SAP workloads, mission-critical systems  Production Account
Development     Dev work, feature development, testing        Development Account
Pre-Production  Staging for pre-release validation            Pre-Production Account
Quality (QA)    Integration, functional, regression testing   Quality Account
Sandbox         Ad-hoc experiments, PoC, learning             Sandbox Account

Key Principles:
  • Each environment has its own AWS account.
  • Accounts are fully isolated for networking, security, and billing.
  • Tagging ensures automation, reporting, and governance.
Technical Implementation Steps:

Step 1: Account Tagging
Purpose: Identify environment type, track costs, and enforce governance.
Action Items:
  • Log in to each AWS account.
  • Navigate to AWS Organizations → Accounts → Tags → Add tag.
Apply the following recommended tags:
Tag Key      Example Value                                       Purpose
Environment  Production / Development / QA / Sandbox / Pre-Prod  Identify environment for automation & policies
Project      SAP Migration Project                               Track project-related costs
Owner        <Team Name>                                         Define responsible team

Best Practices:
  • Use AWS Tag Policies to enforce consistent tags across accounts.
  • Ensure new accounts inherit required tags automatically.
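As a local sketch of the check a tag policy enforces, the required keys below mirror the table above; the set of allowed Environment values is an assumption:

```python
# Sketch: validate an account's tags against the recommended scheme above.
# The allowed Environment values are assumptions taken from the table.
REQUIRED_TAG_KEYS = {"Environment", "Project", "Owner"}
ALLOWED_ENVIRONMENTS = {"Production", "Development", "QA", "Sandbox", "Pre-Prod"}

def check_account_tags(tags: dict) -> list:
    """Return a list of human-readable tag-policy violations."""
    problems = []
    missing = REQUIRED_TAG_KEYS - tags.keys()
    if missing:
        problems.append(f"missing tags: {sorted(missing)}")
    env = tags.get("Environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"unexpected Environment value: {env}")
    return problems

# Example: a Sandbox account missing the Owner tag
print(check_account_tags({"Environment": "Sandbox", "Project": "SAP Migration Project"}))
```

A script like this can run in CI against the output of the Organizations tagging API before a new account is promoted.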
Step 2: VPC and Network Setup per Environment
Purpose: Isolate network traffic and prevent overlap between environments.
Action Items:
  • Create one VPC per environment account.
Assign non-overlapping CIDR blocks:
Environment     VPC CIDR     Example VPC Name
Production      10.0.0.0/16  VPC-Prod
Development     10.1.0.0/16  VPC-Dev
Pre-Production  10.2.0.0/16  VPC-PreProd
Quality (QA)    10.3.0.0/16  VPC-QA
Sandbox         10.4.0.0/16  VPC-Sandbox

Subnets and Routing:
  • Public subnets: Bastion hosts, NAT Gateways
  • Private subnets: SAP application servers, databases, shared services
  • Spread subnets across 2–3 Availability Zones (AZs) for high availability

Route tables:
  • Public → Internet Gateway (IGW)
  • Private → NAT Gateway (one per AZ)
  • Attach Transit Gateway (TGW) in the Network Account for centralized inter-VPC communication
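The non-overlap requirement for the CIDR plan above can be verified locally; this sketch uses Python's standard ipaddress module with the exact values from the table:

```python
import ipaddress

# The per-environment VPC CIDRs from the table above.
VPC_CIDRS = {
    "VPC-Prod": "10.0.0.0/16",
    "VPC-Dev": "10.1.0.0/16",
    "VPC-PreProd": "10.2.0.0/16",
    "VPC-QA": "10.3.0.0/16",
    "VPC-Sandbox": "10.4.0.0/16",
}

def find_overlaps(cidrs: dict) -> list:
    """Return every pair of VPC names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(c) for name, c in cidrs.items()}
    names = sorted(nets)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if nets[a].overlaps(nets[b])]

print(find_overlaps(VPC_CIDRS))  # → [] (no overlaps in the plan above)
```

Running this whenever a new environment is added prevents a new VPC CIDR from colliding with an existing one before anything is provisioned.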

Step 3: Security and Access Control
Purpose: Enforce least privilege and protect production workloads.
Action Items:
IAM Roles & Policies:
  • Production: Only approved administrators have access
  • Dev/QA/Sandbox: Developers may have more access, restricted via IAM policies
AWS SSO / Identity Management:
  • Centralized authentication across all accounts
  • Assign roles per environment for consistent access
Encryption & Key Management:
  • Create environment-specific AWS KMS CMKs for EBS, S3, RDS
  • Restrict access per environment using key policies
Network Security:
  • Security groups and NACLs enforce isolation
  • No direct access from non-prod to production unless via approved pipelines or VPN
Additional Security Measures:
  • Enable MFA for all privileged users
  • Log all administrative actions for auditability
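For the environment-specific KMS key policies mentioned above, a minimal statement sketch follows; the account ID and role name are placeholders, and a complete key policy should also keep the account root as a key administrator so the key stays manageable:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowProdAdminsOnly",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111111111111:role/ProdAdmin"},
      "Action": "kms:*",
      "Resource": "*"
    }
  ]
}
```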

Step 4: Environment Isolation
Purpose: Prevent accidental impact between environments.
Implementation:
Network Isolation:
  • Each VPC is in its own AWS account
  • Transit Gateway routes only approved connections
Logging & Monitoring:
  • Enable CloudTrail, CloudWatch, and AWS Config per environment
  • Optionally, send logs to a centralized Logging Account
Access Control:
  • Only approved pipelines or VPNs may access Production from non-prod environments
Step 5: Infrastructure Automation
Purpose: Standardize deployments, reduce human error, and allow scalability.
Action Items:
Use CloudFormation StackSets or Terraform to automate:
  • VPC creation, subnets, and routing
  • IAM roles and SSO groups
  • KMS key creation
  • Shared Services deployments
Track costs and usage per environment with AWS Cost Explorer and Budgets
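As a sketch of what such automation codifies, a minimal CloudFormation fragment for one environment VPC (CIDR and tag values taken from the earlier tables; the logical resource name is illustrative):

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Development environment VPC (sketch)
Resources:
  DevVpc:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.1.0.0/16
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: VPC-Dev
        - Key: Environment
          Value: Development
```

Deployed via StackSets from the management account, the same template can be stamped into each environment account with only the CIDR and tags parameterized.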

Step 6: Optional Future Environments
  • Training Account: For SAP training environments
  • Disaster Recovery Account: DR site for Production workloads
Both can be added by creating new accounts and VPCs and attaching them to the Network Account's Transit Gateway, with minimal impact on the existing setup.

Diagram – Environment to Account Mapping:

Environment Mapping:
Environment      AWS Account
---------------------------------------
Production       Production Account
Development      Development Account
Pre-Production   Pre-Prod Account
Quality (QA)     QA Account
Sandbox          Sandbox Account

Advantages
Security: Isolated blast radius; centralized monitoring.
Governance: Tags/SCPs for compliance.
Cost: Per-account visibility.
Scale: Add accounts seamlessly.

AWS Landscape 1: AWS Organization

Modern enterprises running workloads on AWS benefit greatly from a multi-account strategy. Using multiple accounts improves security, simplifies management, and allows precise cost tracking. AWS Organizations provides a centralized framework to manage multiple accounts, enforce governance using Service Control Policies (SCPs), and consolidate billing.

A well-structured AWS Organization ensures:
  • Clear separation between production, security, and non-production environments.
  • Centralized governance for auditing, compliance, and security.
  • Scalability to support growth and automated account provisioning.
  • Simplified cross-account operations for shared services and management.
This structure enables organizations to follow AWS best practices while keeping development and testing environments flexible.

Objective:
Define a robust AWS Organization structure that:
  • Provides centralized management of AWS accounts.
  • Enforces governance and security policies consistently.
  • Enables consolidated billing and cost visibility.
  • Supports scalable growth and automated account provisioning.
We recommend a hierarchical AWS Organization structure using Organizational Units (OUs) to separate workloads by environment and function:

Root OU (Master Payer Account only)
├── Production OU
│   └── Production Account (e.g., prd-web-001)
├── Security OU
│   ├── Security Account
│   ├── Logging Account
│   ├── Shared Services Account
│   └── Network Services Account
└── Non-Production OU
    ├── Development Account (e.g., dev-app-001)
    ├── QA Account
    ├── Pre-Prod Account
    └── Sandbox Account

OU Purpose:
  • Root OU – Holds only the Master Payer Account (MPA).
  • Production OU – Hosts production workloads.
  • Security OU – Manages security, logging, and audit accounts.
  • Non-Production OU – Development, QA, Pre-Prod, and Sandbox environments.
Benefits of this OU structure:
  • Clear separation of environments for governance and billing.
  • Easier application of SCPs at the OU level.
  • Supports future account growth and automation.
  • Improves operational visibility and compliance.
Technical Implementation Steps:

Step 1: Create AWS Organization
Log in to AWS Console
URL: https://aws.amazon.com/console/
Use the Master Payer Account (MPA) credentials.

Open AWS Organizations
Search Organizations in the console.
Click AWS Organizations.

Create Organization
Click Create organization.
Select Enable all features (required for SCPs and full governance).
Review warnings and click Create organization.

Verify Creation
Ensure your MPA is shown as the management account (formerly called the master account).
You can now invite existing accounts or create new accounts.

Step 2: Create Organizational Units (OUs)
In AWS Organizations → Organize accounts.
Click Add an organizational unit and create:
Production (under Root)
Security (under Root)
Non-Production (under Root)

Step 3: Create AWS Accounts under OUs
Use the MPA to create accounts via AWS Organizations console or CLI. Example accounts:

OU              Accounts                                              Naming Example
Production      Production Account                                    prd-web-001
Security        Security, Logging, Shared Services, Network Services  sec-log-001, shared-svcs-001
Non-Production  Development, QA, Pre-Prod, Sandbox                    dev-app-001, qa-db-001

Notes:
Master Payer Account stays at Root.
Use consistent naming conventions for clarity, e.g., prd-web-001, dev-app-002.
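The naming convention can be linted before account creation; the prefix list below is an assumption extrapolated from the examples above:

```python
import re

# Sketch: enforce the <env>-<role>-<seq> naming convention seen in the
# examples (prd-web-001, dev-app-002). The prefix set is an assumption.
NAME_RE = re.compile(r"^(prd|dev|qa|sec|shared|preprod|sbx)-[a-z]+-\d{3}$")

def is_valid_account_name(name: str) -> bool:
    return NAME_RE.fullmatch(name) is not None

print(is_valid_account_name("prd-web-001"))  # True
print(is_valid_account_name("Prod_Web_1"))   # False
```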

Step 4: Apply Service Control Policies (SCPs)
Attach policies at OU or account level via AWS Organizations → Policies.
OU              SCP Guidance
Production      Strict: Allow only essential services (e.g., deny public S3 buckets).
Non-Production  Flexible: Permit dev tools (e.g., allow EC2 spot instances).
Security        Restricted: Security services only (e.g., deny non-security actions).

Best Practices for SCPs
Test in Sandbox first.
Layer with IAM policies for fine-grained control.
Start from the example SCPs in the AWS Organizations documentation (e.g., restricting EC2 instance types or Regions) rather than writing policies from scratch.
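For reference, an SCP is a plain JSON policy document attached at the OU or account level. A minimal sketch for the Production OU that denies setting S3 ACLs (one narrow way to block public buckets; real deployments usually combine this with S3 Block Public Access):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyPublicS3Acls",
      "Effect": "Deny",
      "Action": ["s3:PutBucketAcl", "s3:PutObjectAcl"],
      "Resource": "*"
    }
  ]
}
```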

Central Services in Security OU
AWS CloudTrail: Organization-wide logging.
AWS Config: Resource compliance monitoring.
AWS Security Hub: Central security dashboard.
AWS GuardDuty: Threat detection.

Key Advantages
Security: Strong isolation; centralized monitoring.
Governance: OU-level SCP guardrails.
Scalability: Add accounts easily; automate with AWS Control Tower.
Cost Management: Consolidated billing with tags for environment tracking.


Puppet

Puppet is a configuration management tool used to automate server and infrastructure management. Instead of manually installing software or changing configurations, Puppet ensures the desired state is automatically applied across all machines.

Puppet Server:
Stores Manifests/Modules: Keeps configuration files (.pp) and modules (organized sets of manifests, templates, files).
Signs Agent Certificates: Approves agent SSL requests for secure communication.
Compiles Catalogs: Generates a personalized set of instructions for each agent using manifests, modules, facts, and Hiera data.
Environment Management: Manages separate environments (production, staging, development) so configurations can differ per environment.
Node Classifications: Assigns classes/modules to nodes using an external node classifier (ENC) or PuppetDB.
PuppetDB Integration: Stores facts, catalogs, and reports for querying and analytics.

Puppet Agent:
Requests Catalogs: Contacts the server to get instructions.
Applies Configuration Changes: Updates the system to match the desired state.
Reports Back: Sends a summary of changes and status to the server.
Fact Collection: Gathers system info using Facter.
Idempotency: Ensures applying the same catalog multiple times won’t cause unintended changes.
No-Operation (noop) Mode: Can test changes without applying them.

Key Components:
Manifest: File in Puppet DSL defining desired state.
Module: Collection of manifests, templates, and files for reusable configuration.
Facter: Tool on the agent that provides system information.
PuppetDB: Stores node data, facts, catalogs, and reports.
Hiera: Hierarchical data lookup to separate data from code.
Resource Types: Built-in resource types (file, package, service, user, group, cron, etc.).
Defined Types: User-defined resources for reusable configuration blocks.
Functions: Custom Puppet or Ruby functions for reusable logic.
Tasks / Plans: Bolt-compatible tasks and orchestration plans.

1. Puppet Server Installation:
1.1: Add Puppet Repository
Puppet provides its own repository for installation.
On RHEL/CentOS:
sudo rpm -Uvh https://yum.puppet.com/puppet7-release-el-8.noarch.rpm
sudo dnf install puppetserver -y
On Ubuntu:
wget https://apt.puppet.com/puppet7-release-jammy.deb
sudo dpkg -i puppet7-release-jammy.deb
sudo apt-get update
sudo apt-get install puppetserver -y

1.2: Configure Puppet Server
By default, Puppet Server allocates 2 GB of RAM to the JVM. On smaller hosts you can lower this in /etc/sysconfig/puppetserver (RHEL/CentOS) or /etc/default/puppetserver (Ubuntu):
JAVA_ARGS="-Xms512m -Xmx512m"

1.3: Start and Enable Puppet Server
sudo systemctl enable puppetserver
sudo systemctl start puppetserver
sudo systemctl status puppetserver

1.4: Open Firewall (if applicable)
Puppet Server listens on TCP port 8140:
sudo firewall-cmd --permanent --add-port=8140/tcp
sudo firewall-cmd --reload

2. Puppet Agent (Client) Installation:
2.1: Add Puppet Repository
Use the same repository as the server.
On RHEL/CentOS:
sudo rpm -Uvh https://yum.puppet.com/puppet7-release-el-8.noarch.rpm
sudo dnf install puppet-agent -y
On Ubuntu:
sudo apt-get install puppet-agent -y

2.2: Configure Puppet Agent
Edit /etc/puppetlabs/puppet/puppet.conf:
[main]
server = puppet.example.com
certname = agent1.example.com
environment = production
runinterval = 30m
Replace puppet.example.com with your Puppet Server hostname.
certname is the unique identity for this agent.

2.3: Start and Enable Puppet Agent
sudo systemctl enable puppet
sudo systemctl start puppet
sudo systemctl status puppet

2.4: Sign Agent Certificate on Server
On the Puppet Server:
sudo puppetserver ca list  # shows pending certificate requests
sudo puppetserver ca sign --certname agent1.example.com
On the agent, test the connection:
sudo puppet agent -t
It should now communicate with the server successfully.
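Once the agent checks in, the server compiles whatever the environment's main manifest assigns to the node. A minimal site.pp sketch (conventionally /etc/puppetlabs/code/environments/production/manifests/site.pp; the package and service names assume a RHEL-family host running chrony):

```puppet
# Sketch: keep time sync installed and running on every signed agent.
node default {
  package { 'chrony':
    ensure => installed,
  }

  service { 'chronyd':
    ensure  => running,
    enable  => true,
    require => Package['chrony'],
  }
}
```

On the next `puppet agent -t` run, each agent converges to this state and reports back to the server.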

Puppet Resource Parameters:
Parameter                 Description                                              Example
ensure               ---> Desired state of the resource                          ---> ensure => present
owner                ---> File owner (user)                                     ---> owner => 'root'
group                ---> File group                                             ---> group => 'root'
mode                 ---> File permissions (numeric)                             ---> mode => '0644'
content              ---> Text content for a file                               ---> content => 'Hello Puppet!'
source               ---> Source file location from Puppet module               ---> source => 'puppet:///modules/my_module/file.txt'
require              ---> Dependency on another resource                         ---> require => Package['nginx']
recurse              ---> Apply resource to children (files/dirs)               ---> recurse => true
enable               ---> Enable service at boot                                 ---> enable => true
provider             ---> Package manager or resource provider (yum, apt, gem)  ---> provider => 'yum'
path                 ---> Full path to file or executable                       ---> path => '/usr/bin/custom'
shell                ---> Default shell for user accounts                       ---> shell => '/bin/bash'
home                 ---> Home directory for users                               ---> home => '/home/john'
managehome           ---> Whether to create/manage user home                     ---> managehome => true
uid                  ---> User ID number                                         ---> uid => 1050
gid                  ---> Group ID number or name                               ---> gid => 'users'
hasstatus            ---> Service supports status check                         ---> hasstatus => true
hasrestart           ---> Service supports restart                               ---> hasrestart => true
subscribe            ---> Trigger resource refresh when another resource changes ---> subscribe => File['/tmp/config.txt']
notify               ---> Notify another resource to take action after changes ---> notify => Service['nginx']
backup               ---> Backup original file before changes                   ---> backup => true
replace              ---> Whether to overwrite file content                     ---> replace => true
force                ---> Force overwrite (files, links)                         ---> force => true
target               ---> Target path for symlinks or files                     ---> target => '/etc/nginx/conf.d/default.conf'
links                ---> Behavior for managing symlinks (follow, manage)    ---> links => manage
source_permissions   ---> Preserve permissions from source files                 ---> source_permissions => use
creates              ---> Only run command if file does not exist               ---> creates => '/tmp/done.txt'
command              ---> Execute a shell command                               ---> command => '/usr/bin/touch /tmp/file.txt'
cwd                  ---> Working directory for command execution               ---> cwd => '/tmp'
environment          ---> Environment variables for command                     ---> environment => ['PATH=/usr/bin:/bin']
onlyif               ---> Run command only if condition is true                 ---> onlyif => 'test -f /tmp/file.txt'
unless               ---> Run command unless condition is true                   ---> unless => 'test -f /tmp/file.txt'
timeout              ---> Timeout for command execution                         ---> timeout => 60
path (exec)          ---> Paths for executables when running commands   ---> path => ['/usr/bin','/bin']
loglevel             ---> Logging level for resource                             ---> loglevel => 'notice'
tag                  ---> Assign a tag to resource                               ---> tag => 'webserver'
seltype              ---> SELinux file type                                     ---> seltype => 'httpd_sys_content_t'
selrole              ---> SELinux role                                           ---> selrole => 'object_r'
seluser              ---> SELinux user                                           ---> seluser => 'system_u'
acl                  ---> File access control lists (Linux/Solaris)             ---> acl => ['user::rwx','group::r-x']
dacl                 ---> File access control list (Windows)                     ---> dacl => 'BUILTIN\Administrators:F'
password             ---> User account password hash                             ---> password => 'hashed_password_here'
groups               ---> Additional groups for user                             ---> groups => ['sudo','adm']
refreshonly          ---> Run exec resource only on refresh                     ---> refreshonly => true
resource_name        ---> Name of the resource instance                         ---> resource_name => 'nginx_service'
ensure_packages      ---> stdlib function: ensure multiple packages are present  ---> ensure_packages(['nginx','git'])
path (file)          ---> Search path for file sources                           ---> path => ['/etc/puppet/files','/usr/local/files']
replace (line)       ---> Whether to replace a line in a file                   ---> replace => true
match                ---> Used in file_line to match regex patterns             ---> match => '^Listen'
line                 ---> Used in file_line to define content                   ---> line => 'Listen 8080'
notify (exec)        ---> Notify an exec resource to run after change     ---> notify => Exec['restart_nginx']
tries / try_sleep    ---> Retry execution for commands                           ---> tries => 3, try_sleep => 5
audit                ---> Track specific attributes without enforcing changes  ---> audit => ['owner','mode']
purge                ---> Remove unmanaged files in directories               ---> purge => true
source_dir           ---> Specify source directory for copying files recursively  ---> source_dir => 'puppet:///modules/my_module/dir/'

Cloud / Virtualization Specific Parameters:
Parameter               Description                                                Example
region             ---> AWS region                            ---> region => 'us-east-1'
access_key         ---> AWS access key                        ---> access_key => 'AKIA...'
secret_key         ---> AWS secret key                        ---> secret_key => '...'
instance_type      ---> AWS EC2 instance type                 ---> instance_type => 't2.micro'
image_id           ---> AWS AMI ID                            ---> image_id => 'ami-123456'
state              ---> Desired state for cloud/VM resources  ---> state => 'running'
subnet_id          ---> AWS subnet ID                         ---> subnet_id => 'subnet-12345'
security_group_ids ---> AWS security groups                   ---> security_group_ids => ['sg-12345','sg-67890']
key_name           ---> SSH key for VM access                 ---> key_name => 'my-key'
tags               ---> Assign tags to cloud resources        ---> tags => {'env'=>'prod','role'=>'web'}
disk_size          ---> VM disk size in GB                    ---> disk_size => 50
os_type            ---> OS type for VM deployment             ---> os_type => 'Linux'
network_interface  ---> Network interface attachment          ---> network_interface => 'eth0'
zone               ---> GCP compute zone                      ---> zone => 'us-central1-a'
machine_type       ---> GCP instance type                     ---> machine_type => 'n1-standard-1'
image              ---> VM image / OS                         ---> image => 'debian-11'
project            ---> GCP project name                      ---> project => 'my-gcp-project'
image_project      ---> GCP image project                     ---> image_project => 'debian-cloud'
resource_group     ---> Azure resource group                  ---> resource_group => 'my-rg'
location           ---> Azure region                          ---> location => 'eastus'
vm_size            ---> Azure VM size                         ---> vm_size => 'Standard_B1s'
availability_set   ---> Azure availability set                ---> availability_set => 'my-avset'
datacenter         ---> VMware datacenter                     ---> datacenter => 'DC1'
cluster            ---> VMware cluster name                   ---> cluster => 'Cluster1'
guest_id           ---> VMware guest OS type                  ---> guest_id => 'ubuntu64Guest'
cpu                ---> Number of CPUs for VM                 ---> cpu => 2
memory             ---> VM memory in MB                       ---> memory => 2048

Puppet Modules:
Self-contained collection of manifests, files, templates, and other resources.
Helps structure Puppet code for easy maintenance, sharing, and scaling.
Location: Typically stored in /etc/puppetlabs/code/environments/production/modules/<module_name>/.

Module Structure:
my_module/
├── manifests/      # .pp files
├── files/          # Static files
├── templates/      # ERB templates
├── examples/       # Example manifests
├── tests/          # Unit/acceptance tests
├── data/           # Hiera data (optional)
├── lib/            # Custom Ruby functions/types
└── metadata.json   # Module metadata

Module Metadata (metadata.json) Parameters:
name ---> Module name ---> name => 'my_module'
version ---> Module version ---> version => '1.0.0'
author ---> Author/maintainer ---> author => 'John Doe'
summary ---> Short module description ---> summary => 'Installs and configures Nginx'
license ---> License type ---> license => 'Apache-2.0'
source ---> Module source URL ---> source => 'https://github.com/example/my_module'
dependencies ---> Module dependencies ---> dependencies => [{"name"=>"puppetlabs/stdlib","version_requirement"=>">= 4.25.0"}]
operatingsystem_support ---> Supported OS and versions ---> operatingsystem_support => [{"operatingsystem"=>"Ubuntu","operatingsystemrelease"=>["20.04","22.04"]}]
project_page ---> URL of project homepage ---> project_page => 'https://example.com/my_module'
issues_url ---> URL for bug reports/issues ---> issues_url => 'https://github.com/example/my_module/issues'
tags ---> Keywords for module categorization ---> tags => ['nginx','webserver','puppet']

Class Parameters & Data Types:
Example class:
class nginx (
  String $package_name = 'nginx',
  Boolean $service_enable = true,
  String $config_file_path = '/etc/nginx/nginx.conf',
  String $config_template = 'nginx/nginx.conf.erb',
  String $service_ensure = 'running',
  String $file_owner = 'root',
  String $file_group = 'root',
  String $file_mode = '0644',
) {
  package { $package_name: ensure => installed }
  service { $package_name: ensure => $service_ensure, enable => $service_enable }
  file { $config_file_path:
    ensure  => file,
    content => template($config_template),
    owner   => $file_owner,
    group   => $file_group,
    mode    => $file_mode,
  }
}
package_name ---> Name of package ---> package_name => 'nginx'
service_enable ---> Enable service at boot ---> service_enable => true
config_file_path ---> Path to config file ---> config_file_path => '/etc/nginx/nginx.conf'
config_template ---> ERB template file ---> config_template => 'nginx/nginx.conf.erb'
service_ensure ---> Desired service state ---> service_ensure => 'running'
file_owner ---> Owner of managed files ---> file_owner => 'root'
file_group ---> Group of managed files ---> file_group => 'root'
file_mode ---> File permissions ---> file_mode => '0644'
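Declaring the class with resource-like syntax overrides only the parameters you list; the rest keep the defaults shown above:

```puppet
# Sketch: declare the nginx class with selected overrides.
class { 'nginx':
  package_name   => 'nginx',
  service_ensure => 'running',
  file_mode      => '0640',
}
```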

Supported Data Types:
String, Boolean, Integer, Array[String], Hash[String, Any]

Defined Types & Virtual Resources
Example:
define my_module::vhost (
  String $docroot,
  String $port = '80',
) {
  file { "/var/www/${title}":
    ensure => directory,
    owner  => 'www-data',
    group  => 'www-data',
  }
}
docroot ---> Root directory of vhost ---> docroot => '/var/www/example'
port ---> Port number ---> port => '80'
virtual resource ---> Declare without applying (prefix @) ---> @file { '/tmp/test': ensure => file }
resource collector ---> Realize matching virtual resources ---> File <| title == '/tmp/test' |>

Hiera Integration:
Example:
$package_name = lookup('my_module::package_name', String, 'first', 'nginx')
lookup_key ---> Key to fetch from Hiera ---> lookup_key => 'my_module::package_name'
default_value ---> Value if key is missing ---> default_value => 'nginx'
hiera_hierarchy ---> Lookup hierarchy ---> hiera_hierarchy => ['nodes/%{::fqdn}.yaml','common.yaml']
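With the 'first' merge strategy, the lookup returns the value from the highest-priority hierarchy level that defines the key; a sketch of matching data files (the file paths and hostname are assumptions):

```yaml
# data/common.yaml (lowest-priority default)
my_module::package_name: 'nginx'

# data/nodes/web01.example.com.yaml (wins for this node under 'first' merge)
my_module::package_name: 'nginx-light'
```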

Module Testing
Unit Tests: puppetlabs_spec_helper, rspec-puppet
Acceptance Tests: Beaker
Directory convention:
tests/
├── init.pp       # Basic class test
├── webserver.pp  # Specific class test
test_file ---> Path to test manifest ---> test_file => '/etc/puppetlabs/code/environments/production/modules/my_module/tests/init.pp'

Ansible

Ansible is an open-source IT automation tool used for configuration management, application deployment, and task orchestration. It is agentless, works over SSH/WinRM, and uses YAML-based playbooks to define tasks. Key features include idempotency, modularity (modules & roles), and easy inventory management.

Ansible Basics:
Agentless ---> Ansible doesn’t require agents; works over SSH (Linux/Unix) or WinRM (Windows)
Idempotent ---> Running the same playbook multiple times won’t change system if already in desired state
YAML-based Playbooks ---> Human-readable YAML files define tasks
Modules ---> Perform specific tasks (install packages, copy files, manage services)
Inventory Management ---> Hosts defined in inventory file (static or dynamic)
Roles ---> Modular units of tasks, handlers, variables for reusable playbooks
Collections ---> Bundles of modules, plugins, and roles distributed together
Handlers ---> Tasks triggered by notify (usually used to restart services)
Plugins ---> Extend Ansible functionality (connection, callback, filter, lookup)
Facts ---> System information gathered automatically from hosts

Playbook Variables / Structure:
name: <Playbook name>        ---> Description of the playbook
hosts: <host/group>          ---> Target machines or groups
become: <yes/no>             ---> Use sudo/root privileges
gather_facts: <yes/no>       ---> Collect system facts
vars:                        ---> Define custom variables
  <var_name>: <value>        ---> Assign value
vars_files:                  ---> Load variables from external YAML files
  - <path/to/vars_file.yml>
register: <variable_name>    ---> Capture task output
loop:                        ---> Iterate over a list of items
with_items:                  ---> Legacy form of loop
when: <condition>            ---> Conditional task execution
tasks:                       ---> List of tasks to execute
handlers:                    ---> Special tasks triggered by notify
notify: <handler_name>       ---> Trigger handler if task changes
environment:                 ---> Set environment variables for task
include_tasks: <file.yml>    ---> Include another task file
import_playbook: <file.yml>  ---> Import another playbook
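The keywords above map onto a playbook like this minimal sketch (the host group, package, and handler names are assumptions):

```yaml
# Sketch: install nginx and restart it only when the install task changes.
- name: Configure web servers
  hosts: webservers
  become: yes
  gather_facts: yes
  vars:
    http_port: 8080
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
      notify: Restart nginx
  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```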

Common Playbook Variables:
Linux / Unix:
ansible_user: <username>             ---> SSH login user
ansible_password: <password>         ---> SSH password
ansible_port: <port>                 ---> SSH port (default 22)
ansible_become: <yes/no>             ---> Run tasks as root/sudo
ansible_become_method: <sudo/su>     ---> Privilege escalation method
ansible_python_interpreter: <path>   ---> Python interpreter on host
ansible_host: <ip/hostname>        ---> Target host address
ansible_private_key_file: <path>   ---> SSH private key for authentication
ansible_ssh_common_args: <args>    ---> Extra SSH arguments

Windows:
ansible_connection: winrm            ---> Use WinRM for Windows hosts
ansible_winrm_transport: <ntlm/kerberos> ---> Windows transport type
ansible_winrm_server_cert_validation: <ignore/validate> ---> SSL validation
ansible_user: <username>           ---> Windows login user
ansible_password: <password>       ---> Windows password
ansible_port: 5985 ---> HTTP WinRM
ansible_port: 5986 ---> HTTPS WinRM

VMware:
vcenter_hostname: <hostname>         ---> vCenter server
vcenter_user: <username>             ---> vCenter login
vcenter_password: <password>         ---> vCenter password
validate_certs: <yes/no>           ---> SSL certificate validation
datacenter: <dc_name>              ---> VMware datacenter
cluster: <cluster_name>            ---> VMware cluster

AWS / Cloud:
aws_access_key: <key>                ---> AWS access key
aws_secret_key: <secret>             ---> AWS secret key
aws_region: <region>                 ---> AWS region
aws_profile: <profile_name>        ---> AWS CLI profile
aws_session_token: <token>         ---> Temporary AWS session token

Control Variables / Execution:
tags: <tag_name>                     ---> Run only tasks with this tag
ignore_errors: <yes/no>              ---> Continue if task fails
timeout: <seconds>                   ---> Task timeout
retries: <number>                    ---> Retry task on failure
delay: <seconds>                     ---> Wait before retrying
run_once: yes                         ---> Execute task only once
delegate_to: <host>                  ---> Run task on another host
changed_when: <condition>            ---> Custom “changed” status
failed_when: <condition>             ---> Custom failure detection
serial: <number>                   ---> Run playbook on hosts in batches
max_fail_percentage: <percent>     ---> Stop play if failure threshold reached
any_errors_fatal: <yes/no>         ---> Stop play on first error
throttle: <number>                 ---> Limit parallel task execution
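Several of these controls combine naturally in a rolling-update play; a sketch (host group and service name are assumptions):

```yaml
# Sketch: restart a service two hosts at a time, retrying transient failures.
- name: Rolling restart
  hosts: webservers
  serial: 2
  max_fail_percentage: 25
  tasks:
    - name: Restart app service
      ansible.builtin.service:
        name: myapp
        state: restarted
      register: result
      until: result is succeeded
      retries: 3
      delay: 5
```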

CLI Commands / Options:
ansible <host/group> -m ping                     ---> Test connectivity
ansible <host/group> -m command -a "<command>"   ---> Run command
ansible <host/group> -m shell -a "<shell_cmd>"   ---> Run shell command
ansible <host/group> -m copy -a "src=<src> dest=<dest>" ---> Copy file
ansible-playbook <playbook.yml>                  ---> Run playbook
ansible-playbook <playbook.yml> --check          ---> Dry-run
ansible-playbook <playbook.yml> --diff           ---> Show changes
ansible-playbook <playbook.yml> --tags "<tag>"  ---> Run tasks with tag
ansible-playbook <playbook.yml> --skip-tags "<tag>" ---> Skip tasks with tag
ansible-inventory -i <inventory_file> --list     ---> Show inventory hosts
ansible-doc -l                                   ---> List all modules
ansible-doc <module_name>                        ---> Show module documentation
ansible-galaxy install <role_name>              ---> Install role from Galaxy
ansible-galaxy init <role_name>                 ---> Create new role
ansible-vault create/edit/view/encrypt/decrypt ---> Vault operations
ansible-config list                           ---> Show all configuration options
ansible-config view                           ---> View active configuration
ansible-playbook <playbook.yml> -i <inventory> ---> Specify inventory
ansible-playbook <playbook.yml> -l <host>     ---> Limit execution to host/group
ansible-playbook <playbook.yml> -u <user>     ---> Remote user
ansible-playbook <playbook.yml> -b            ---> Run with become
ansible-playbook <playbook.yml> -vvv          ---> Debug verbosity

Environment Variables:
ANSIBLE_CONFIG: <path>                   ---> Use custom ansible.cfg
ANSIBLE_INVENTORY: <path>                ---> Default inventory
ANSIBLE_ROLES_PATH: <path>               ---> Path for roles
ANSIBLE_STDOUT_CALLBACK: <default/json/yaml> ---> Output format
ANSIBLE_HOST_KEY_CHECKING: <True/False>  ---> Disable SSH host key check
ANSIBLE_LOG_PATH: <path>              ---> Log file location
ANSIBLE_RETRY_FILES_ENABLED: <True/False> ---> Create retry files
ANSIBLE_TIMEOUT: <seconds>            ---> SSH timeout

Modules (Examples):

Package Management

apt/yum/dnf/package: name=<package_name> state=<present/absent/latest> ---> Install/remove packages
command ---> Does not process shell operators (|, >, &&)
shell   ---> Executes command via shell and supports pipes/redirection
git: repo=<url> dest=<path> version=<branch>              ---> Git repo
win_package/win_copy/win_command/win_shell               ---> Windows tasks
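The command/shell distinction above can be illustrated as follows; command treats the pipe as a literal argument, while shell hands the line to /bin/sh:

```yaml
- name: Pipe does NOT work -- command passes '|' as a literal argument to cat
  command: cat /etc/passwd | grep root
  register: cmd_result
  ignore_errors: yes

- name: Pipe works -- shell interprets '|' and redirection
  shell: cat /etc/passwd | grep root
  register: shell_result
```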

Services
service/win_service: name=<service> state=<started/stopped/restarted> enabled=<yes/no> ---> Manage services (enabled controls boot-time startup)

Files
copy/file/template: src/dest/owner/group/mode           ---> File management

Users / Groups
user/group: name=<name> state=<present/absent>          ---> Manage users/groups

Cron / Scheduling
cron: name=<job> minute=<0-59> hour=<0-23> user=<user> job=<command> ---> Schedule jobs
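As a concrete cron example (the job name and script path are placeholders):

```yaml
- name: Schedule nightly cleanup at 02:30
  cron:
    name: nightly-cleanup                 # hypothetical job name
    minute: "30"
    hour: "2"
    user: root
    job: "/usr/local/bin/cleanup.sh > /dev/null 2>&1"
```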

Networking
uri: url=<endpoint> method=<GET/POST> ---> Call REST API
get_url: url=<file_url> dest=<path>    ---> Download file
unarchive: src=<file> dest=<path>      ---> Extract archive
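A sketch combining the three modules above in one flow (URLs and paths are placeholders):

```yaml
- name: Check service health endpoint
  uri:
    url: https://example.com/health       # placeholder endpoint
    method: GET
    status_code: 200

- name: Download release archive
  get_url:
    url: https://example.com/app.tar.gz   # placeholder URL
    dest: /tmp/app.tar.gz

- name: Extract archive on the target host
  unarchive:
    src: /tmp/app.tar.gz
    dest: /opt/app
    remote_src: yes                       # archive already exists on the remote host
```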

Cloud / VMware
vmware_guest: hostname/user/password name=<vm> state=<poweredon/poweredoff/present/absent>
amazon.aws.ec2_instance: key_name/instance_type/image_id/region/state
azure.azcollection.azure_rm_virtualmachine: resource_group/name/vm_size/admin_username/image/state
google.cloud.gcp_compute_instance: name/machine_type/zone/image/project/state
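For example, a minimal amazon.aws.ec2_instance task (instance name, key pair, and AMI ID are placeholder values, not real resources):

```yaml
- name: Launch an EC2 instance
  amazon.aws.ec2_instance:
    name: demo-host                        # placeholder instance name
    key_name: my-keypair                   # placeholder key pair
    instance_type: t3.micro
    image_id: ami-0123456789abcdef0        # placeholder AMI ID
    region: us-east-1
    state: present
```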

Control Flow / Conditions:

Conditional Execution

when: <condition>                                ---> Run task only if true
register: <variable>                             ---> Capture task output
loop / with_items: <list>                        ---> Iterate over items
retries + delay + until: <condition>             ---> Retry task until success (do-while)
block/rescue/always:                             ---> Try-catch-finally
ignore_errors: yes                               ---> Continue on failure
changed_when / failed_when: <condition>          ---> Override status
until: <condition> ---> Retry until condition met
set_fact: <var>=<value> ---> Define runtime variables
include_role: name=<role> ---> Execute role inside task

Examples:
package_name: "{{ 'vim' if ansible_facts['os_family']=='Debian' else 'vim-enhanced' }}"
- name: Install packages with loop
  apt: name="{{ item }}" state=present
  loop: "{{ debian_packages if ansible_facts['os_family']=='Debian' else redhat_packages }}"
- block:
    - command: /usr/bin/do_something
  rescue:
    - debug: msg="Command failed"
  always:
    - debug: msg="Always runs"

Dynamic Variables:

ansible_facts['os_family']        ---> OS family
ansible_facts['distribution']     ---> OS distribution
ansible_facts['distribution_version'] ---> Version
ansible_facts['hostname']         ---> Hostname
ansible_facts['interfaces']       ---> Network interfaces
ansible_facts['memory_mb']        ---> Memory info
ansible_facts['processor']        ---> CPU info
ansible_facts['fqdn']              ---> Fully qualified domain name
ansible_facts['architecture']      ---> CPU architecture
ansible_facts['uptime_seconds']    ---> System uptime
ansible_facts['default_ipv4']      ---> Default IPv4 address
ansible_facts['mounts']            ---> Mounted filesystems
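Facts like those above are typically consumed in debug output or task conditionals, for example:

```yaml
- name: Show basic host information
  debug:
    msg: "{{ ansible_facts['fqdn'] }} runs {{ ansible_facts['distribution'] }} {{ ansible_facts['distribution_version'] }}"

- name: Run only on 64-bit Debian-family hosts
  apt:
    name: htop
    state: present
  when:
    - ansible_facts['os_family'] == 'Debian'
    - ansible_facts['architecture'] == 'x86_64'
```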

Standard Ansible project directory structure:

ansible-project/
├── ansible.cfg
├── inventory/
│   ├── hosts
│   └── production
├── group_vars/
│   └── all.yml
├── host_vars/
│   └── server1.yml
├── roles/
│   └── webserver/
│       ├── tasks/
│       │   └── main.yml
│       ├── handlers/
│       │   └── main.yml
│       ├── templates/
│       ├── files/
│       ├── vars/
│       │   └── main.yml
│       ├── defaults/
│       │   └── main.yml
│       └── meta/
│           └── main.yml
├── playbooks/
│   └── deploy.yml
└── files/

Explanation:
ansible.cfg ---> Main configuration file
inventory/ ---> Host inventory files
group_vars/ ---> Variables for host groups
host_vars/ ---> Variables for specific hosts
roles/ ---> Reusable automation roles
playbooks/ ---> Playbook YAML files
files/ ---> Static files used by playbooks
templates/ ---> Jinja2 templates

Ansible Vault (Secrets Management):
Ansible Vault is used to encrypt sensitive data such as passwords, API keys, and credentials.

Vault Features:
Encryption ---> Protect sensitive variables
Decryption ---> Access secrets during playbook execution
File-level security ---> Encrypt full YAML files
Variable-level security ---> Encrypt individual variables

Vault Commands:
ansible-vault create <file.yml>        ---> Create encrypted file
ansible-vault edit <file.yml>          ---> Edit encrypted file
ansible-vault view <file.yml>          ---> View encrypted file
ansible-vault encrypt <file.yml>       ---> Encrypt existing file
ansible-vault decrypt <file.yml>       ---> Decrypt file
ansible-vault rekey <file.yml>         ---> Change vault password

Using Vault in Playbook Execution:
ansible-playbook playbook.yml --ask-vault-pass
ansible-playbook playbook.yml --vault-password-file <file>

Example Encrypted Variable:
db_password: !vault |
          $ANSIBLE_VAULT;1.1;AES256
          6539646533343231646464...
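A vaulted variable is referenced like any other variable; decryption happens transparently at runtime when the vault password is supplied. A sketch (the module choice and user name are illustrative):

```yaml
- name: Create database user
  mysql_user:                              # hypothetical module choice
    name: appuser                          # placeholder user
    password: "{{ db_password }}"          # decrypted from the vaulted value at runtime
    state: present
  no_log: true                             # keep the secret out of task output
```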

Dynamic Inventory:
Dynamic inventory allows Ansible to automatically fetch hosts from cloud providers or external systems.

Instead of static host lists, inventory is generated dynamically using scripts or plugins.

Dynamic Inventory Sources:
AWS ---> EC2 instances
VMware ---> vCenter virtual machines
Azure ---> Azure VMs
GCP ---> Google Cloud instances
Kubernetes ---> Pods and nodes

Example Dynamic Inventory Plugins:
amazon.aws.aws_ec2 ---> AWS EC2 inventory
vmware.vmware_vm_inventory ---> VMware vCenter inventory
azure.azcollection.azure_rm ---> Azure inventory
google.cloud.gcp_compute ---> Google Cloud inventory
kubernetes.core.k8s ---> Kubernetes resources

Example AWS Dynamic Inventory Configuration:
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
filters:
  instance-state-name: running
keyed_groups:
  - key: tags.Name

Command to test dynamic inventory:
ansible-inventory -i aws_ec2.yml --graph
ansible-inventory -i aws_ec2.yml --list

Ansible Configuration File (ansible.cfg):
The ansible.cfg file controls Ansible behavior and execution settings.

Ansible checks configuration in the following order:
ANSIBLE_CONFIG (environment variable)
./ansible.cfg (current project directory)
~/.ansible.cfg (user home directory)
/etc/ansible/ansible.cfg (global configuration)

Common Configuration Options:
inventory = <path>                ---> Default inventory file
roles_path = <path>               ---> Location of roles
forks = <number>                  ---> Number of parallel hosts
timeout = <seconds>               ---> SSH connection timeout
remote_user = <username>          ---> Default SSH user
host_key_checking = <True/False>  ---> Enable/disable SSH host verification
retry_files_enabled = <True/False> ---> Create retry files on failure
log_path = <path>                 ---> Log file location
stdout_callback = <default/yaml/json> ---> Output formatting

Example ansible.cfg:
[defaults]
inventory = ./inventory/hosts
roles_path = ./roles
forks = 20
host_key_checking = False
retry_files_enabled = False
log_path = ./ansible.log

Ansible Roles (Best Practices):
Roles allow reusable and modular automation components.

Role Best Practices:
Single Responsibility ---> Each role should manage one component
Use defaults/main.yml ---> Store default variables
Avoid hardcoded values ---> Use variables
Use handlers ---> Restart services only when required
Keep tasks small ---> Maintain readability
Use tags ---> Enable selective execution

Example Role Task with Handler:

roles/webserver/tasks/main.yml:
- name: Install nginx
  apt:
    name: nginx
    state: present
  notify: restart nginx

roles/webserver/handlers/main.yml:
- name: restart nginx
  service:
    name: nginx
    state: restarted

Role Dependencies (meta/main.yml):
dependencies:
  - role: common
  - role: security

Performance Optimization:
For large infrastructures, tuning Ansible improves execution speed.

Performance Settings:
forks: <number>                   ---> Increase parallel execution
pipelining: True                  ---> Reduce SSH operations
ssh_args: ControlMaster options   ---> Reuse SSH connections
fact_caching: <method>            ---> Cache gathered facts
gather_facts: no                  ---> Disable if not needed

Example Performance Configuration:
[defaults]
forks = 50
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_cache

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s

Benefits:
Faster execution ---> Parallel task processing
Reduced SSH overhead ---> Persistent connections
Lower API calls ---> Cached system facts

Troubleshooting & Debugging:
Common troubleshooting methods for Ansible failures.

Debugging Commands:
ansible-playbook playbook.yml -vvv        ---> Detailed debugging output
ansible-playbook playbook.yml --syntax-check ---> Validate playbook syntax
ansible-playbook playbook.yml --list-tasks ---> Show tasks without executing
ansible-playbook playbook.yml --list-hosts ---> Show targeted hosts
ansible-playbook playbook.yml --start-at-task "<task>" ---> Start from specific task

Debug Module:
- debug:
    msg: "Variable value is {{ variable_name }}"

Check Variable Values:

- debug:
    var: ansible_facts

Common Issues:
SSH authentication failure ---> Check user, key, and permissions
Python missing on target ---> Install Python on remote host
Permission denied ---> Use become: yes
Host unreachable ---> Verify inventory and network access
Module failure ---> Run with -vvv for detailed logs

Ansible Security Best Practices:
Security practices to protect infrastructure automation and credentials.

Credential Protection:
Use Ansible Vault ---> Encrypt sensitive data such as passwords, API keys, and tokens
Avoid plaintext secrets ---> Never store credentials directly in playbooks
Use vault password file ---> Store vault password securely
Use environment variables ---> Pass sensitive data dynamically

Authentication:
Use SSH keys ---> Prefer key-based authentication instead of passwords
Restrict SSH access ---> Limit login users and disable root login
Rotate credentials ---> Regularly update keys and passwords
Use strong key types ---> RSA 4096 or ED25519 recommended

Privilege Escalation:
become: yes ---> Use privilege escalation only when required
Limit sudo access ---> Restrict commands allowed via sudo
Avoid running entire playbook as root ---> Use become only for specific tasks
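Task-level escalation, as recommended above, might look like this (the group and package names are illustrative):

```yaml
- hosts: webservers
  become: no                  # play runs unprivileged by default
  tasks:
    - name: Read-only check, no root needed
      command: id

    - name: Install package -- escalate for this task only
      apt:
        name: nginx
        state: present
      become: yes
      become_user: root
```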

Access Control:
Role-Based Access ---> Restrict who can run playbooks
Inventory segmentation ---> Separate environments (dev/test/prod)
Use separate credentials ---> Different credentials for each environment

File & Secret Management:
Protect inventory files ---> Restrict file permissions
Store secrets in vault ---> Avoid exposing credentials in Git
Use secret management tools ---> Integrate with external vault systems

External Secret Integration:
HashiCorp Vault ---> Secure dynamic secrets
AWS Secrets Manager ---> Store cloud credentials
Azure Key Vault ---> Manage Azure secrets
CyberArk ---> Enterprise privileged access management
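As one illustration, the community.hashi_vault collection provides a lookup plugin for HashiCorp Vault. The secret path and the reliance on VAULT_ADDR/VAULT_TOKEN environment variables below are assumptions for this sketch, not a fixed convention:

```yaml
- name: Fetch a secret from HashiCorp Vault
  debug:
    msg: "{{ lookup('community.hashi_vault.hashi_vault', 'secret/data/app:password') }}"
  # assumes VAULT_ADDR and VAULT_TOKEN are set in the controller's environment
  no_log: true
```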

Repository Security:
Use private repositories ---> Restrict access to automation code
Enable code reviews ---> Validate playbook changes
Use CI/CD security scanning ---> Detect exposed secrets

Audit & Logging:
Enable logging ---> Record automation activity
Centralized logs ---> Send logs to SIEM systems
Track playbook runs ---> Maintain execution history

Example Secure File Permissions:
chmod 600 inventory
chmod 600 group_vars/*
chmod 600 host_vars/*

Ansible Quick Reference:

Connectivity Test:
ansible <host/group> -m ping                     ---> Test connection to hosts
ansible all -m ping                              ---> Test all hosts in inventory

Ad-hoc Commands:
ansible <host/group> -m command -a "<command>"   ---> Run command on hosts
ansible <host/group> -m shell -a "<command>"     ---> Run shell command
ansible <host/group> -m copy -a "src=<src> dest=<dest>" ---> Copy file
ansible <host/group> -m file -a "path=<path> state=directory" ---> Create directory

Playbook Execution:
ansible-playbook playbook.yml                    ---> Run playbook
ansible-playbook playbook.yml -i inventory       ---> Specify inventory
ansible-playbook playbook.yml -l <host/group>    ---> Limit execution to hosts
ansible-playbook playbook.yml -u <user>          ---> Specify SSH user
ansible-playbook playbook.yml -b                 ---> Run with sudo/become

Dry Run / Debug:
ansible-playbook playbook.yml --check            ---> Dry run (no changes)
ansible-playbook playbook.yml --diff             ---> Show file differences
ansible-playbook playbook.yml -vvv               ---> Detailed debugging
ansible-playbook playbook.yml --syntax-check     ---> Validate playbook syntax

Tags Execution:
ansible-playbook playbook.yml --tags "<tag>"     ---> Run specific tagged tasks
ansible-playbook playbook.yml --skip-tags "<tag>" ---> Skip tagged tasks

Inventory Commands:
ansible-inventory -i inventory --list            ---> Show inventory details
ansible-inventory -i inventory --graph           ---> Show host groups

Module Documentation:
ansible-doc -l                                   ---> List all modules
ansible-doc <module_name>                        ---> Show module documentation

Role Management:
ansible-galaxy init <role_name>                  ---> Create role structure
ansible-galaxy install <role_name>               ---> Install role from Galaxy

Vault Commands:
ansible-vault create secrets.yml                 ---> Create encrypted file
ansible-vault edit secrets.yml                   ---> Edit encrypted file
ansible-vault encrypt secrets.yml                ---> Encrypt file
ansible-vault decrypt secrets.yml                ---> Decrypt file

Useful Variables:
ansible_facts['hostname']           ---> Hostname
ansible_facts['os_family']          ---> OS family
ansible_facts['distribution']       ---> Linux distribution
ansible_facts['default_ipv4']       ---> Default IP address