Policy Enforcement for Terraform & OpenTofu with OPA and Scalr

This summary outlines key findings on implementing policy enforcement for Infrastructure as Code (IaC) managed by Terraform and OpenTofu, focusing on Open Policy Agent (OPA) and Scalr.

1. The Need for IaC Governance

Managing infrastructure with Terraform or OpenTofu provides speed and consistency but also introduces risks without robust governance. These risks include security vulnerabilities, compliance breaches, cost overruns, and inconsistent configurations. Policy as Code (PaC) is the practice of defining and managing policies using code to mitigate these risks.

2. Open Policy Agent (OPA) and Rego

  • OPA: An open-source, general-purpose policy engine used to enforce policies across various systems, including Terraform/OpenTofu. It evaluates policies against JSON input, typically the Terraform plan.
  • Rego: A high-level declarative language used to write OPA policies. Policies define conditions that must hold true. Key features include its JSON-native design, rule-based structure, and modularity.
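  • Evaluation Process: OPA evaluates Rego policies against a JSON representation of the Terraform plan. The plan is first saved to a file (terraform plan -out=tfplan.binary) and then converted to JSON (terraform show -json tfplan.binary). The plan's resource_changes array lists every proposed create, update, and delete, which makes it the main input for inspecting proposed infrastructure modifications.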
  • Tooling:
    • OPA CLI: For local policy evaluation, testing, and interactive development (opa eval, opa test, opa run).
    • Conftest: A utility for testing structured configuration data (like Terraform plans) against OPA policies, commonly used in CI/CD.
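
To make the role of resource_changes concrete, the following is a minimal Python sketch of the kind of inspection a policy performs on the plan JSON. It is not Rego and not how OPA is implemented. The sample plan is a hypothetical, heavily trimmed excerpt, and the helper names (proposed_changes, deleted_addresses) are illustrative only.

```python
import json

# Hypothetical, heavily trimmed excerpt of `terraform show -json` output.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_instance.web", "type": "aws_instance",
     "change": {"actions": ["create"], "after": {"instance_type": "t3.micro"}}},
    {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
     "change": {"actions": ["no-op"], "after": {"bucket": "logs"}}},
    {"address": "aws_instance.old", "type": "aws_instance",
     "change": {"actions": ["delete"], "after": null}}
  ]
}
""")


def proposed_changes(plan):
    """Return (address, actions) for every resource the plan would modify."""
    changes = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        # Unchanged resources are reported with the single action "no-op".
        if actions != ["no-op"]:
            changes.append((rc["address"], actions))
    return changes


def deleted_addresses(plan):
    """Return addresses of resources the plan would delete (or replace)."""
    return [addr for addr, actions in proposed_changes(plan) if "delete" in actions]
```

A Rego policy expresses the same idea declaratively, typically by iterating over input.resource_changes and emitting a message for each entry that breaks a rule.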

3. Scalr for IaC Policy and Governance

Scalr is a SaaS platform providing a remote state and operations backend for Terraform/OpenTofu, offering advanced governance features.

  • Hierarchical Structure: Organizes resources and policies via Account, Environment, and Workspace scopes, allowing for policy inheritance and centralized governance with decentralized operations.
  • Native OPA Integration:
    • Policies are managed via GitOps (stored in VCS, linked to Scalr).
    • Pre-plan Checks: Evaluate run context (initiator, VCS details) before plan generation. Because no plan has been computed yet, they are an inexpensive way to reject a run early on contextual grounds.
    • Post-plan Checks: Evaluate the full Terraform plan JSON after plan generation for detailed resource validation.
    • Enforcement Levels: Policies can be set to Hard Mandatory (blocks run), Soft Mandatory (requires approval for post-plan violations), or Advisory (logs warning).
  • Policy Management: OPA policy groups are defined (often at Account scope using the Scalr Terraform provider) and linked to environments, with workspaces inheriting them.
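
The enforcement levels listed above can be read as a small decision table. The Python sketch below models that table only; it is not Scalr's implementation. The enum values and outcome strings are hypothetical labels, and the treatment of a Soft Mandatory violation at the pre-plan stage is an assumption, since this summary only describes the approval step for post-plan violations.

```python
from enum import Enum


class Level(Enum):
    # Hypothetical identifiers for the three levels named in the text.
    HARD_MANDATORY = "hard-mandatory"
    SOFT_MANDATORY = "soft-mandatory"
    ADVISORY = "advisory"


def run_outcome(level, stage, violations):
    """Map a policy result to a run outcome.

    stage is "pre-plan" or "post-plan"; violations is a list of messages.
    Returns one of "pass", "warn", "needs_approval", "blocked".
    """
    if not violations:
        return "pass"
    if level is Level.ADVISORY:
        return "warn"  # logged as a warning, run continues
    if level is Level.SOFT_MANDATORY and stage == "post-plan":
        return "needs_approval"  # run waits for an explicit approval
    # Hard Mandatory blocks the run.
    # ASSUMPTION: a Soft Mandatory violation at pre-plan is also treated as blocking.
    return "blocked"
```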

4. Implementing Specific Policy Types

4.1. Cost Policies

  • Infracost: An open-source tool that estimates cloud costs from Terraform code/plans, outputting JSON.
  • OPA for Cost Control: Rego policies (in the infracost package) consume Infracost's JSON output to enforce budget limits (e.g., on input.diffTotalMonthlyCost) or restrict high-cost individual resources.
  • Scalr & Cost: Scalr provides cost estimation and can integrate Infracost data into its OPA policy checks.
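
As an illustration of a budget check on diffTotalMonthlyCost, here is a minimal Python sketch of the logic such a Rego policy would encode. It assumes the cost field arrives as a string-encoded number (so it is parsed before comparison); the function name cost_violations and the limit passed in are hypothetical.

```python
from decimal import Decimal


def cost_violations(infracost, max_monthly_increase):
    """Return violation messages if the plan raises monthly cost past the limit.

    infracost is the parsed Infracost JSON document;
    max_monthly_increase is a Decimal budget limit (hypothetical value).
    """
    raw = infracost.get("diffTotalMonthlyCost")
    if raw is None:
        return ["diffTotalMonthlyCost is missing from the Infracost output"]
    # ASSUMPTION: the value may be a string such as "120.50", so parse it first.
    diff = Decimal(str(raw))
    if diff > max_monthly_increase:
        return [f"monthly cost increase {diff} exceeds the limit of {max_monthly_increase}"]
    return []
```

For example, cost_violations({"diffTotalMonthlyCost": "120.50"}, Decimal("100")) returns a single violation, while an increase of "80" against the same limit returns an empty list.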

4.2. Access Policies

  • Scalr RBAC (Role-Based Access Control):
    • Defines who can perform what actions at which scope using Roles, Subjects (Users, Teams, Service Accounts), Access Policies, and Scopes (Account, Environment, Workspace).
    • Managed via "Access Policy as Code" using the Scalr Terraform provider (scalr_role, scalr_iam_team, scalr_access_policy).
  • OPA for Fine-Grained Authorization: Within Scalr, OPA policies (pre-plan/post-plan) can enforce specific configuration constraints within an authorized action, complementing RBAC.

4.3. Audit Policies

  • Scalr Audit Logs: Capture detailed information about actions within the platform (who, what, when, how).
  • Streaming: Audit logs can be streamed to external systems like Datadog or AWS EventBridge for centralized logging, advanced analysis, long-term storage, and SIEM integration.
  • Monitoring: External systems are configured to alert on specific audit events (e.g., IAM changes, policy overrides, audit log disabling). Scalr sends an AuditLogDisabled event if streaming is tampered with.

4.4. Resource Policies

  • OPA and Rego: Used to define rules against the Terraform plan JSON to ensure compliance with organizational standards.
  • Common Use Cases:
    • Naming Conventions: Validating resource names using Rego's built-in string and regular-expression functions (startswith, regex.match).
    • Resource Type/Size Restrictions: Defining allowed lists for instance types, VM sizes, etc.
    • Region/Location Restrictions: Ensuring deployments only in approved geographic areas.
    • Mandatory Tags: Enforcing the presence and format of specific tags for cost allocation, automation, etc.
  • Enforcement: Via conftest in CI/CD or Scalr's integrated OPA checks.
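
The four use cases above all follow the same pattern: walk resource_changes, read the planned attributes, and collect a message for each rule that is broken. The Python sketch below shows that pattern outside of Rego. All allow-lists, the name pattern, and the required tag keys are hypothetical, and the attribute names (name, instance_type, location, tags) follow common AWS and Azure provider conventions; which attributes a resource actually exposes depends on its provider.

```python
import re

# Hypothetical organizational standards.
ALLOWED_INSTANCE_TYPES = {"t3.micro", "t3.small"}
ALLOWED_LOCATIONS = {"westeurope", "northeurope"}
REQUIRED_TAGS = {"owner", "cost-center"}
NAME_PATTERN = re.compile(r"^prod-[a-z0-9-]+$")


def resource_violations(plan):
    """Return a list of violation messages for a parsed Terraform plan JSON."""
    violations = []
    for rc in plan.get("resource_changes", []):
        after = rc.get("change", {}).get("after")
        if not after:  # deleted resources have no planned state to check
            continue
        addr = rc["address"]

        name = after.get("name")
        if name is not None and not NAME_PATTERN.match(name):
            violations.append(f"{addr}: name '{name}' does not match the naming convention")

        itype = after.get("instance_type")
        if itype is not None and itype not in ALLOWED_INSTANCE_TYPES:
            violations.append(f"{addr}: instance type '{itype}' is not allowed")

        location = after.get("location")
        if location is not None and location not in ALLOWED_LOCATIONS:
            violations.append(f"{addr}: location '{location}' is not approved")

        # Only resources that expose a tags attribute are checked for tags.
        if "tags" in after:
            missing = REQUIRED_TAGS - set(after["tags"] or {})
            if missing:
                violations.append(f"{addr}: missing required tags {sorted(missing)}")
    return violations
```

Each check is skipped when the attribute is absent, so one function can cover resource types from different providers. A production policy would more likely scope each rule to specific resource types.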

5. Policy Enforcement in CI/CD Pipelines

  • Shift-Left Governance: Integrating policy checks into CI/CD provides rapid feedback to developers.
  • conftest Integration:
    • GitHub Actions & GitLab CI: Workflows generate the Terraform plan as JSON and then run conftest test against it using the defined Rego policies. The pipeline fails on any violation, because conftest exits with a non-zero status.
  • Scalr VCS Integration: Scalr connects to VCS providers and automatically triggers runs (including OPA policy checks) on commits or pull/merge requests, providing feedback directly in the VCS.
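
The conftest-based pipeline step described above boils down to a short command sequence. The Python sketch below assembles and runs that sequence; it is one possible shape for such a step, not a prescribed workflow. The file names, the policy directory, and the function names are hypothetical, and the tf_bin parameter is included on the assumption that OpenTofu users would substitute tofu for terraform.

```python
import subprocess


def pipeline_commands(tf_bin="terraform", plan_file="tfplan.binary",
                      plan_json="tfplan.json", policy_dir="policy"):
    """Return the commands for a plan-then-check pipeline step, in order."""
    return [
        [tf_bin, "init", "-input=false"],
        [tf_bin, "plan", "-input=false", f"-out={plan_file}"],
        [tf_bin, "show", "-json", plan_file],  # stdout is written to plan_json
        ["conftest", "test", "--policy", policy_dir, plan_json],
    ]


def run_policy_gate(tf_bin="terraform", plan_file="tfplan.binary",
                    plan_json="tfplan.json", policy_dir="policy"):
    """Run the step; raises CalledProcessError if any command fails."""
    for cmd in pipeline_commands(tf_bin, plan_file, plan_json, policy_dir):
        if cmd[1] == "show":
            # Capture the JSON plan into a file for conftest to read.
            with open(plan_json, "w") as fh:
                subprocess.run(cmd, check=True, stdout=fh)
        else:
            # A policy violation makes conftest exit non-zero, which fails the job.
            subprocess.run(cmd, check=True)
```

In a GitHub Actions or GitLab CI job, the same four commands would usually appear directly as shell steps; wrapping them in a script mainly helps when the sequence is shared across repositories.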

6. Benefits and Approach

  • Key Benefits of PaC: Consistency, scalability, traceability, documentation, rapid remediation, collaboration, enhanced security, compliance, and cost control.
  • Iterative Adoption: Start with simple local policies (conftest), integrate into CI/CD, and then consider platforms like Scalr for more comprehensive, managed governance as needs grow.

7. Future Directions

The field is evolving towards more external data integration in OPA, advanced policy layering, potential mutation policies, AI/ML in policy suggestion/detection, and standardization of policy libraries for common compliance benchmarks.