Architecture Design

Chapter 4 — Typical system topology, core indicators, device connections, and business logic


4.1 Typical System Introduction

The typical enterprise boundary security topology is organized around a redundant enforcement core that connects upstream ISP circuits to downstream internal zones through a series of controlled inspection points. The topology is designed to eliminate single points of failure at every layer, from ISP connectivity through to the management plane. All traffic flows are logged and forwarded to the central SIEM, and NDR sensors provide passive detection at three key vantage points: inside the internet edge, the DMZ, and the east-west core.

The topology distinguishes three major deployment domains: the internet-facing perimeter (ISPs, edge routers, NGFW, DMZ), the cloud connectivity layer (dual tunnels or direct connects to cloud transit hub and centralized cloud firewall endpoints), and the partner connectivity layer (dedicated partner VRF with separate gateway). A separate out-of-band management network connects to all devices and is the only permitted path for administrative access. All logs flow from device collectors to the SIEM, and SOAR integrates with the IdP and ITSM ticketing system for automated response with human approval gates.


Figure 4.1: Typical Enterprise Boundary Security Topology — Dual ISP, dual edge routers, NGFW HA cluster, DMZ zone, user zone, server zone, and management zone with full redundancy

Node Roles and Responsibilities

Node | Primary Role | Redundancy Model | Key Interfaces
Edge Routers (A/B) | Routing control, DDoS diversion signaling, route filtering | Active/Active with ECMP | ISP uplinks, NGFW downlinks, OOB management
NGFW (FW-01/FW-02) | Policy enforcement, NAT, IPS, segmentation gateway | Active/Standby with state sync | Router uplinks, core switch downlinks, DMZ, HA links, OOB
WAF/API Gateway | Application-layer security and rate limiting | Active/Active pair | DMZ switch uplinks, backend app VLAN
Core Switches (A/B) | Internal zone switching, VLAN/VRF enforcement | MLAG/VSS pair | NGFW downlinks, access switches, server zone
NDR/IDS Sensors | Passive detection and visibility | N+1 sensor coverage | TAP aggregation or SPAN ports at three vantage points
SIEM/SOAR | Correlation and automated response with guardrails | Clustered/HA deployment | Log collectors, IdP, ITSM, enforcement devices
OOB Management | Out-of-band admin access to all devices | Dedicated network, no production traffic | Console servers, bastion host, NTP, PKI
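The cross-connect pattern described above can be checked mechanically. The sketch below is illustrative: it models a simplified version of the topology as an undirected link list (node names such as "edge-a" and "fw-01" are hypothetical stand-ins, not taken from the actual cabling map) and flags any node whose failure would disconnect both ISPs from the server zone — a hidden single point of failure.

```python
# Hypothetical adjacency list mirroring the dual-everything topology.
LINKS = [
    ("isp-a", "edge-a"), ("isp-b", "edge-b"),
    ("edge-a", "fw-01"), ("edge-a", "fw-02"),
    ("edge-b", "fw-01"), ("edge-b", "fw-02"),
    ("fw-01", "core-a"), ("fw-01", "core-b"),
    ("fw-02", "core-a"), ("fw-02", "core-b"),
    ("core-a", "server-zone"), ("core-b", "server-zone"),
]

def reachable(links, src, dst):
    """Graph search over an undirected link list."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen, frontier = {src}, [src]
    while frontier:
        node = frontier.pop()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return dst in seen

def single_points_of_failure(links, sources, dst):
    """Nodes whose removal disconnects every source from dst."""
    nodes = {n for link in links for n in link} - set(sources) - {dst}
    spofs = []
    for victim in sorted(nodes):
        surviving = [l for l in links if victim not in l]
        if not any(reachable(surviving, s, dst) for s in sources):
            spofs.append(victim)
    return spofs

# With full cross-connects there is no single point of failure.
print(single_points_of_failure(LINKS, ["isp-a", "isp-b"], "server-zone"))  # []
```

Dropping the two edge-b to firewall cross-connects from LINKS makes edge-a a single point of failure, which is exactly the kind of hidden dependency the cross-connect pattern in Figure 4.2 is meant to prevent.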

4.2 Core Functions and Indicators

The following table defines the twelve core performance and security indicators for the boundary security system. Each indicator includes its rationale, implementation path, and acceptance method. These indicators form the basis for the acceptance test plan described in Chapter 10 and should be baselined during initial deployment and monitored continuously during operations.

Indicator | Why It Matters | Implementation Path | Acceptance Method
Zone Isolation Effectiveness | Prevents lateral movement between trust zones | VRF/VPC + default deny inter-zone rules | Path scan + rule audit
Policy Hit Rate | Detects unused rules that create risk | Logging + analytics on rule counters | Monthly report with zero-hit rule review
TLS Inspection Coverage | Reduces blind spots in encrypted traffic | Selective decrypt policy with privacy review | Coverage report + privacy/legal sign-off
WAF Block Efficacy | Stops web exploits at the application layer | Positive security model + continuous tuning | OWASP test results + false positive rate
DDoS Mitigation Time | Maintains availability under volumetric attacks | Upstream scrubbing with BGP diversion | Drill with measured mitigation time
HA Failover Time | Ensures continuity during device failures | State sync + health check tuning | Quarterly failover test with RTO measurement
CPS Headroom | Provides resilience during traffic spikes | Sizing at 2x peak CPS + tuning | Load testing with synthetic traffic
Session Table Headroom | Prevents session drops during peak usage | Capacity planning at 2x peak concurrent sessions | Stress tests with session monitoring
Log Completeness | Ensures audit evidence and detection coverage | Source onboarding checklist + EPS budgeting | Daily SLO checks: expected vs. received sources
Time Sync Drift | Enables accurate log correlation across devices | Authenticated NTP hierarchy with monitoring | Drift report: all devices within 1 second
Configuration Drift | Prevents unauthorized or accidental changes | Config-as-code with automated drift detection | Drift detection alerts + remediation SLA
Change Success Rate | Measures stability of change management process | Pre-check automation + staged rollout + rollback | Change KPIs: success rate, rollback rate, MTTR
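The 2x sizing rule behind the CPS and session-table headroom indicators reduces to a simple check during baselining. The sketch below is a minimal illustration; the peak figures and platform limits are hypothetical placeholders, to be replaced with measured peaks and the vendor-documented limits of the deployed NGFW platform.

```python
# Illustrative headroom check for the CPS and session-table indicators.
# All numbers are hypothetical; substitute real measurements and limits.

def headroom_ok(platform_limit, observed_peak, required_multiple=2.0):
    """True if the platform can absorb required_multiple x the observed peak."""
    return platform_limit >= required_multiple * observed_peak

peak_cps = 40_000            # observed peak connections per second
peak_sessions = 1_500_000    # observed peak concurrent sessions

print(headroom_ok(platform_limit=100_000, observed_peak=peak_cps))         # True
print(headroom_ok(platform_limit=2_000_000, observed_peak=peak_sessions))  # False
```

In the second call the platform would need capacity for 3,000,000 concurrent sessions to satisfy the 2x rule, so the check fails and the sizing must be revisited before acceptance.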

4.3 Device Connection Diagram

The physical cabling map defines the exact interface connections between all boundary devices, including redundancy cross-connections, HA dedicated links, LACP/MLAG uplinks, and power distribution. The diagram below shows the dual-rack deployment with color-coded cables for each connection type. Correct physical cabling is essential for achieving the redundancy guarantees specified in the design — any deviation from the cross-connect pattern can create hidden single points of failure.


Figure 4.2: Physical Cabling Map — Dual rack deployment with color-coded cables: yellow (ISP uplinks), blue (data plane), red (HA heartbeat), gray (management); all connections labeled with port numbers


Figure 4.3: High Availability & Redundancy Architecture — Dual ISP with BGP failover, NGFW Active/Standby with state sync, MLAG core switches, and dual UPS/PDU power paths with RTO/RPO targets

Redundancy Strategy

4.4 Business Logic and Exception Handling

The boundary security system processes traffic through a defined data flow and control flow. Understanding these flows is essential for troubleshooting, capacity planning, and designing SOAR playbooks. The exception handling procedures define how the system responds to three common failure scenarios that affect availability and security posture.
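The human approval gate mentioned in 4.1 is the central control-flow decision in a SOAR playbook. The skeleton below is a sketch of that decision logic only; the Alert shape, severity values, and auto-execution rule are illustrative assumptions, not the API of any real SOAR product.

```python
# Sketch of a SOAR playbook step with a human approval gate.
# The severity-based rule is a hypothetical guardrail policy.
from dataclasses import dataclass

@dataclass
class Alert:
    src_ip: str
    severity: str  # "low" | "medium" | "high"

def requires_approval(alert: Alert) -> bool:
    # Hypothetical guardrail: only high-severity alerts may auto-execute;
    # everything else waits for an analyst decision via the ITSM ticket.
    return alert.severity != "high"

def run_playbook(alert: Alert, approved: bool) -> str:
    if requires_approval(alert) and not approved:
        return "pending-approval"  # ticket stays open, no enforcement yet
    return f"blocked {alert.src_ip}"

print(run_playbook(Alert("203.0.113.7", "high"), approved=False))    # blocked 203.0.113.7
print(run_playbook(Alert("198.51.100.9", "medium"), approved=False)) # pending-approval
```

The key design point is that the enforcement action is unreachable without either a high-confidence classification or an explicit analyst approval, so an automation bug degrades to an open ticket rather than an unwanted block.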

Data and Control Flows

Exception Handling

ISP Failure: BGP routing converges to the remaining ISP within the configured convergence timer (target: ≤ 30 seconds). Ensure that NAT and public IP mapping are planned for single-ISP operation, as some services may have ISP-specific IP dependencies. Monitor for asymmetric routing after failover.
Cloud Link Failure: Traffic fails over to the secondary tunnel or secondary region. Ensure that DNS health checks and application-level health checks are configured to detect the failure and redirect traffic before the tunnel failover completes. Verify that the secondary path has sufficient capacity for full traffic load.
WAF Backend Failure: The WAF must return a controlled error response (503 with custom error page) rather than allowing direct-to-origin bypass. Configure WAF backend health checks with appropriate thresholds and ensure that the error response does not expose internal infrastructure information.
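The fail-closed behavior described for WAF backend failure can be sketched as follows. This is an illustrative model of the decision logic, not a real WAF configuration; the backend names and the health-check representation are hypothetical.

```python
# Minimal sketch of fail-closed behavior on WAF backend failure:
# when no healthy backend remains, serve a generic 503 page instead
# of bypassing directly to origin. Names are illustrative.

ERROR_PAGE = "<html><body>Service temporarily unavailable.</body></html>"

def pick_backend(backends):
    """Return a healthy backend, or None if all have failed health checks."""
    healthy = [b for b in backends if b["healthy"]]
    return healthy[0] if healthy else None

def handle_request(backends):
    backend = pick_backend(backends)
    if backend is None:
        # Fail closed: controlled 503 with no internal details,
        # and no direct-to-origin bypass.
        return 503, ERROR_PAGE
    return 200, f"proxied to {backend['name']}"

print(handle_request([{"name": "app-1", "healthy": False},
                      {"name": "app-2", "healthy": True}]))   # (200, 'proxied to app-2')
print(handle_request([{"name": "app-1", "healthy": False},
                      {"name": "app-2", "healthy": False}]))  # (503, error page)
```

Note that the error path returns a static page that names no hosts, ports, or software versions, satisfying the requirement that the controlled response not expose internal infrastructure information.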