Solution Architecture Document — PAN-OS Universal Refactor
Project: PAN-OS Universal Refactor (v6.0)
Author: Jarrod E. Brown
Status: Working desktop tool
Repository: github.com/jarrodebrown/panos-refactor (private)
1. Overview
PAN-OS Universal Refactor is a professional-grade configuration optimization suite for Palo Alto Networks firewalls. It automates the cleanup and consolidation of PAN-OS XML configurations using a phased, "zero-incident" strategy, with automated API-based verification against staging firewalls and audit-ready before/after evidence. It is delivered as a desktop GUI application built in Python with Tkinter.
The tool addresses the universal problem of firewall configuration debt: over years of operation, enterprise PAN-OS deployments accumulate unreferenced address objects, redundant service entries, overlapping rules, and orphaned groups across virtual systems, device groups, and shared contexts. PAN-OS Universal Refactor treats cleanup as a two-phase pipeline — first remove what is unused, then consolidate what remains — with a hard gate between phases and automated validation against a staging firewall before any change is trusted.
2. Problem & Context
Large enterprise firewall configurations accumulate years of unreferenced objects, duplicate rules, and drift across virtual systems, device groups, and shared contexts. A typical enterprise PAN-OS configuration exported as XML can contain thousands of address objects, service objects, address groups, and security rules spread across multiple VSYS partitions or Panorama device groups. Over time, infrastructure changes, personnel turnover, and emergency additions leave behind objects that no active rule references — but nobody removes them because the blast radius of a wrong deletion is a production outage.
Manual cleanup is slow and risky: an engineer must trace every object's usage across Security, NAT, and PBF policies before removing it, and a single mistaken consolidation can drop production traffic. The manual process typically involves exporting the XML, searching for object references with text tools, and making edits with no automated safety net. Network security teams need a way to reduce configuration debt with provable safety, not best-effort hand edits. Audit requirements compound the problem — compliance teams want documented evidence that changes were validated before deployment, not verbal assurance.
3. Goals & Requirements
Functional
- Parse and analyze PAN-OS XML configurations across Security, NAT, and PBF policies in VSYS, Device-Group, and Shared contexts.
- Phase 1 (Cleanup): identify and remove unreferenced objects — address objects, address groups, service objects, and service groups not referenced by any active policy rule.
- Phase 2 (Consolidation): optimize rules and objects by merging duplicates, collapsing overlapping address groups, and simplifying rule structures.
- Validate every modified rule against a staging firewall via the PAN-OS XML API before accepting the change.
- Produce visual before/after audit reports documenting every modification with object-level detail.
- Support PAN-OS 9.x, 10.x, and 11.x XML configuration formats (both firewall and Panorama exports).
Non-functional
- Zero-incident posture: mandatory gating so consolidation cannot run before cleanup.
- Handle configurations up to 50 MB XML / 10,000+ objects without UI freezing.
- Secure XML handling against malformed or hostile input (XXE, billion-laughs).
- Operable by a network engineer without scripting or command-line knowledge.
- Complete Phase 1 analysis of a 5,000-object configuration in under 60 seconds.
4. Decision Rationale
Why Tkinter over Electron or a web UI? The tool targets network security engineers operating in environments where installing Node.js runtimes or running local web servers is restricted by endpoint policy. Tkinter ships with every standard Python installation — no additional runtime, no npm dependencies, no Chromium overhead. A single Python file (or a PyInstaller-built .exe) can be distributed and run on locked-down Windows workstations without admin rights. Electron would have tripled the distribution size (~150 MB vs. ~50 MB for a PyInstaller bundle) and introduced a dependency surface that enterprise security teams would need to vet. The tradeoff is a less polished UI — Tkinter's widget set is functional but dated — but the target users prioritize reliability and portability over visual polish.
Why phased gating instead of a single-pass optimizer? A single-pass approach that simultaneously removes unreferenced objects and consolidates rules creates an ambiguity problem: if an object is unreferenced only because a rule that uses it was merged into another rule during consolidation, is it safe to remove? The two-phase gate eliminates this class of error entirely. Phase 1 operates on the original, unmodified configuration — every reference decision is provably correct against the source of truth. Phase 2 then operates on the cleaned configuration, where the reference graph is simpler and consolidation decisions are unambiguous. This separation also provides a natural checkpoint: the engineer can review Phase 1 results, export a partially-cleaned config, and decide whether to proceed to Phase 2.
Why defusedxml over stdlib ElementTree? PAN-OS XML configurations are exported from production firewalls and may pass through email, shared drives, or ticketing systems before reaching the tool. Any of these transit points could introduce malicious XML payloads (XXE attacks, entity-expansion bombs). The defusedxml library refuses external entity resolution and rejects entity declarations by default (DTD rejection is available through its forbid_dtd option), providing defense-in-depth without requiring the engineer to understand XML security. The tool uses stdlib ElementTree for serialization (writing), where the attack surface is negligible.
Why API validation against staging rather than static analysis alone? Static analysis can verify that object references are internally consistent, but it cannot verify that a consolidated rule will actually match the same traffic as the original rules it replaced. PAN-OS has subtle rule-evaluation behaviors — zone-based forwarding, application-default service handling, negated matches — that are difficult to model outside the firewall itself. By testing each modified rule against a staging firewall via the PAN-OS XML API, the tool leverages the firewall's own rule engine as the oracle, eliminating an entire class of semantic errors that static analysis would miss.
Why a desktop tool instead of a SaaS platform? Firewall configurations contain the complete network topology, IP addressing scheme, and security policy of an organization — they are among the most sensitive artifacts in enterprise IT. Network security teams are reluctant (and often policy-prohibited) to upload these configurations to cloud services. A desktop tool that processes everything locally eliminates data-sovereignty concerns entirely. The configuration never leaves the engineer's workstation.
5. Architecture Overview
The system follows a pipeline architecture with four major layers: the GUI presentation layer, the parsing and analysis layer, the optimization engine (Phase 1 and Phase 2), and the validation and reporting layer. A background thread pool keeps the UI responsive during long-running analysis operations.
The GUI accepts a PAN-OS XML configuration file and exposes controls for each phase. The secure parser loads the XML into an in-memory ElementTree, then the reference engine builds a directed graph mapping every object to the rules that reference it. Phase 1 walks this graph to identify unreferenced objects; Phase 2 applies consolidation strategies to the cleaned configuration. The API validator tests modified rules against a staging firewall, and the reporter generates before/after audit evidence.
6. Components
| # | Component | Module | Responsibility |
|---|---|---|---|
| 1 | GUI (Tkinter) | `gui.py` | Desktop front end: file picker for XML config, phase controls with progress bars, treeview for object/rule inspection, tabbed results pane, export buttons for reports and optimized XML. Runs on the main thread; dispatches analysis to worker threads. |
| 2 | Secure Parser | `parser.py` | XML parsing via `defusedxml.ElementTree` (falls back to stdlib `xml.etree.ElementTree` if defusedxml is unavailable). Validates XML structure, extracts configuration contexts (VSYS, Device-Group, Shared), and builds the in-memory ElementTree. Guards against XXE, entity expansion, and DTD processing attacks. |
| 3 | Reference Engine | `reference.py` | Builds a bidirectional reference graph: for every address object, address group, service object, and service group, records which Security, NAT, and PBF rules reference it, and vice versa. Traverses nested groups recursively (an address group containing another address group is resolved to its leaf members). Handles all context scopes: VSYS-local, Device-Group, and Shared objects with inheritance. |
| 4 | Phase 1 (Cleanup) | `cleanup.py` | Walks the reference graph to identify objects with zero inbound references (no rule or group uses them). Produces a removal manifest: a structured list of unreferenced objects with their type, context, and the evidence that they are unused. Applies removals to the in-memory ElementTree. |
| 5 | Phase 2 (Consolidation) | `consolidate.py` | Applies optimization strategies to the cleaned configuration: duplicate-object merging (objects with identical member lists), overlapping-group collapse (a group that is a strict subset of another), and rule simplification (adjacent rules with identical actions and overlapping match criteria). Each strategy produces a change record documenting what was merged and why. |
| 6 | Phase Controller | `controller.py` | Enforces gating: Phase 2 cannot execute until Phase 1 has completed successfully. Manages phase state transitions (idle → analyzing → review → committed) and exposes the gate status to the GUI. Prevents re-running Phase 1 after Phase 2 has modified the configuration. |
| 7 | API Validator | `validator.py` | Connects to a staging PAN-OS firewall via the XML API (`/api/?type=op` and `/api/?type=config`). For each modified or consolidated rule, submits a `test security-policy-match` request to verify that the rule matches expected traffic. Reports pass/fail per rule with the firewall's response detail. Uses `requests` with TLS verification (configurable for self-signed staging certs). |
| 8 | Reporter | `reporter.py` | Generates the before/after audit report: an HTML document listing every object removed (Phase 1) and every rule modified (Phase 2), with side-by-side XML diffs, reference graph excerpts, and validation results. Includes a summary header with counts, timestamps, and configuration metadata. |
7. User Workflow
A typical workflow proceeds through five stages. The tool's GUI guides the engineer through each stage with visual status indicators and prevents skipping ahead.
Stage 1 — Load configuration. The engineer exports the running configuration from a PAN-OS firewall or Panorama as XML (show config running or Export named configuration snapshot). They open PAN-OS Universal Refactor and select the XML file via the file picker. The secure parser validates the XML and populates the context selector (VSYS list, Device-Group list, or Shared). The status bar confirms the parse: object counts by type, rule counts by policy type, and any parse warnings.
Stage 2 — Phase 1 analysis (cleanup). The engineer clicks "Run Phase 1." The reference engine builds the object-to-rule graph and the cleanup module identifies unreferenced objects. Results appear in the treeview: a categorized list of unreferenced address objects, address groups, service objects, and service groups, each showing its context (VSYS/DG/Shared) and the evidence trail (which rules and groups were checked). The engineer can expand any object to inspect its definition and confirm it is truly unused. Removal is not yet applied — this is a review step.
Stage 3 — Commit Phase 1. After review, the engineer clicks "Commit Phase 1." The tool removes all identified unreferenced objects from the in-memory configuration and rebuilds the reference graph. The gate opens: Phase 2 controls become active. The engineer can optionally export the Phase 1-cleaned XML at this point as a checkpoint.
Stage 4 — Phase 2 analysis (consolidation). The engineer clicks "Run Phase 2." The consolidation module applies its optimization strategies and presents proposed changes: merged objects, collapsed groups, and simplified rules. Each proposed change shows a before/after diff and the strategy that produced it. The engineer can accept or reject individual changes before committing.
Stage 5 — Validate and export. After committing Phase 2 changes, the engineer connects to a staging firewall (entering the hostname and API key in the connection dialog) and clicks "Validate." The API validator tests each modified rule and reports pass/fail results inline. Once validation passes, the engineer exports two artifacts: the optimized XML configuration (ready to import into the production firewall) and the HTML audit report documenting every change with validation evidence.
8. Data Flow
- Configuration input. The engineer provides a PAN-OS XML configuration file exported from a firewall or Panorama. File sizes range from a few hundred KB (small branch firewall) to 50+ MB (large Panorama with dozens of device groups).
- Secure parsing. The parser loads the XML via `defusedxml`, disabling external entity resolution and DTD processing. It validates the root element structure (`<config>` with `<devices>`, `<shared>`, and optionally `<readonly>` sections) and extracts configuration contexts. The result is an in-memory `ElementTree` plus a context manifest listing available VSYS partitions, device groups, and the shared scope.
- Reference graph construction. The reference engine traverses all policy rulebase sections (Security, NAT, PBF) across every context scope. For each rule, it extracts source/destination address references, service references, and group memberships. It builds a bidirectional adjacency map: `object → set(rules)` and `rule → set(objects)`. Nested groups are resolved recursively: if AddressGroup-A contains AddressGroup-B, both groups and all leaf address objects are linked to any rule referencing AddressGroup-A.
- Phase 1: unreferenced object identification. The cleanup module queries the reference graph for objects with an empty inbound-reference set (no rule and no parent group references them). It produces a removal manifest: a list of `(object_type, object_name, context, evidence)` tuples. After engineer review and commit, the objects are removed from the in-memory ElementTree, and the reference graph is rebuilt.
- Phase 2: consolidation. The consolidation module operates on the post-cleanup configuration. It applies three strategies sequentially: (a) duplicate detection, where objects with identical member lists are merged and all rule references are repointed to the surviving object; (b) subset collapse, where address groups that are strict subsets of another group in the same context are collapsed into the superset; (c) rule simplification, where adjacent rules with identical actions, zones, and profiles that differ only in address or service fields are merged into a single rule with the union of those fields. Each strategy emits a change record.
- API validation. For each rule modified or created by Phase 2, the validator submits a `test security-policy-match` request to the staging firewall via the XML API (`/api/?type=op`). The request includes the rule's match criteria (source zone, destination zone, source address, destination address, application, service). The firewall returns whether the rule would match and which rule in its active policy corresponds. A pass means the consolidated rule covers the same traffic as the original rules it replaced.
- Output generation. Two artifacts are produced: the optimized XML configuration (a valid PAN-OS XML file that can be imported via `load config from` or the Panorama commit pipeline) and the HTML audit report.
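The API validation step can be illustrated with a minimal sketch of how a `test security-policy-match` command might be assembled for the operational endpoint. The function names, the example host, and the exact set of child elements are assumptions for illustration; the elements accepted by a given PAN-OS release should be confirmed against the firewall's own API browser. The sketch only builds the request and never sends it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_policy_match_cmd(src_zone, dst_zone, source, destination,
                           protocol, dest_port, application):
    """Return the XML command body for a policy-match test (hypothetical helper)."""
    test = ET.Element("test")
    match = ET.SubElement(test, "security-policy-match")
    fields = [
        ("from", src_zone),
        ("to", dst_zone),
        ("source", source),
        ("destination", destination),
        ("protocol", str(protocol)),        # IP protocol number, e.g. 6 for TCP
        ("destination-port", str(dest_port)),
        ("application", application),
    ]
    for tag, value in fields:
        ET.SubElement(match, tag).text = value
    return ET.tostring(test, encoding="unicode")


def build_op_url(host, cmd):
    """Build the operational-API URL; the API key is assumed to be sent separately."""
    return f"https://{host}/api/?" + urlencode({"type": "op", "cmd": cmd})


cmd = build_policy_match_cmd("trust", "untrust", "10.1.1.10",
                             "203.0.113.5", 6, 443, "ssl")
url = build_op_url("staging-fw.example.com", cmd)
```

Keeping the API key out of the URL avoids it appearing in logs or in the audit report, which is consistent with the no-credential-persistence goal stated elsewhere in this document.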
9. Data Model
The system operates on an in-memory representation of the PAN-OS XML configuration. No persistent database is used — all state lives in the ElementTree and the reference graph for the duration of the session.
Configuration tree — the parsed ElementTree representing the full PAN-OS XML. The tool preserves the original XML structure and namespace; modifications are applied as element insertions, deletions, and attribute updates within the tree. This ensures the exported XML remains valid for PAN-OS import.
Reference graph — a bidirectional adjacency structure implemented as two dictionaries: object_refs: Dict[str, Set[str]] mapping each object's qualified name (type + context + name) to the set of rules that reference it, and rule_refs: Dict[str, Set[str]] mapping each rule to its referenced objects. Group nesting is resolved at build time — the graph represents fully-resolved leaf-level references.
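As a rough illustration of the two-dictionary structure, the sketch below keeps both directions in sync whenever an edge is added. The class name, the `qualified_name` helper, and the `/` separator are assumptions, not the tool's actual identifiers.

```python
from collections import defaultdict
from typing import Dict, Set


def qualified_name(kind: str, context: str, name: str) -> str:
    """Join type, context, and name into one key (separator is an assumption)."""
    return f"{kind}/{context}/{name}"


class ReferenceGraph:
    """Bidirectional map between objects and the rules that reference them."""

    def __init__(self) -> None:
        self.object_refs: Dict[str, Set[str]] = defaultdict(set)
        self.rule_refs: Dict[str, Set[str]] = defaultdict(set)

    def add_object(self, obj: str) -> None:
        # Register the key up front so unreferenced objects appear with an empty set.
        self.object_refs.setdefault(obj, set())

    def add_reference(self, rule: str, obj: str) -> None:
        # Record the edge in both directions.
        self.object_refs[obj].add(rule)
        self.rule_refs[rule].add(obj)


graph = ReferenceGraph()
web = qualified_name("address", "vsys1", "web-01")
old = qualified_name("address", "vsys1", "old-host")
rule = qualified_name("security", "vsys1", "allow-web")
graph.add_object(web)
graph.add_object(old)
graph.add_reference(rule, web)
```

After this runs, `old` is present in `object_refs` with an empty set, which is the condition Phase 1 looks for.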
Removal manifest (Phase 1) — a list of unreferenced objects, each recorded as: object type (address, address-group, service, service-group), object name, context scope (VSYS name, device-group name, or "shared"), and the evidence trail (list of rules and groups that were checked and found to not reference this object).
Change records (Phase 2) — a list of consolidation actions, each recorded as: strategy name (duplicate-merge, subset-collapse, rule-simplify), affected objects/rules (before and after), the rationale (e.g., "AddressGroup-A and AddressGroup-B have identical members"), and the XML diff.
Validation results — per-rule pass/fail records from the staging firewall API, including the API response payload, timestamp, and firewall hostname. These are embedded in the audit report.
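The three record types described above could be modelled as immutable dataclasses, as in the following sketch. The class and field names are assumptions derived from the prose, not the tool's actual definitions.

```python
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class RemovalEntry:
    object_type: str            # address, address-group, service, service-group
    object_name: str
    context: str                # VSYS name, device-group name, or "shared"
    evidence: Tuple[str, ...]   # rules and groups that were checked


@dataclass(frozen=True)
class ChangeRecord:
    strategy: str               # duplicate-merge, subset-collapse, rule-simplify
    before: Tuple[str, ...]
    after: Tuple[str, ...]
    rationale: str
    xml_diff: str


@dataclass(frozen=True)
class ValidationResult:
    rule_name: str
    passed: bool
    response: str               # raw API response payload
    timestamp: str
    firewall: str


entry = RemovalEntry("address", "old-host", "vsys1", ("allow-web", "grp-servers"))
row = asdict(entry)             # plain dict, convenient for report templating
```

Freezing the records means an entry cannot be altered between analysis and report generation, which suits an audit trail.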
10. External Interfaces
| Interface | Endpoint / Protocol | Purpose | Auth | Direction |
|---|---|---|---|---|
| PAN-OS XML API (operational) | `https://<firewall>/api/?type=op` | Test-policy-match requests for validation | API key (provided by engineer at runtime) | Outbound to staging FW |
| PAN-OS XML API (config) | `https://<firewall>/api/?type=config&action=get` | Retrieve configuration sections for comparison (in the PAN-OS XML API, `action=get` reads the candidate configuration and `action=show` reads the running configuration) | API key | Outbound to staging FW |
| PAN-OS XML export | Offline file (`running-config.xml`) | Input configuration to analyze | None (local file) | Local read |
| PAN-OS XML import | Offline file (optimized output) | Cleaned/consolidated configuration for import | None (local file) | Local write |
| Audit report | Local HTML file | Before/after evidence for compliance | None (local file) | Local write |
No cloud services or telemetry. The tool makes no outbound network calls except to the staging firewall specified by the engineer. No usage data, configuration content, or analytics are transmitted. All processing is local.
11. Error Handling & Resilience
Malformed XML input. If the configuration file is not valid XML or does not match the expected PAN-OS schema (missing <config> root, no <devices> section), the parser surfaces a descriptive error in the GUI status bar and aborts loading. The engineer can correct the file and retry. Partial parses are not attempted — the configuration is either fully loaded or rejected.
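A minimal sketch of the all-or-nothing structure check might look like the following. The exception class and function name are hypothetical; the element paths follow the standard PAN-OS layout of `devices/entry/vsys/entry` and `devices/entry/device-group/entry`.

```python
import xml.etree.ElementTree as ET


class ConfigStructureError(ValueError):
    """Raised when the document does not look like a PAN-OS configuration."""


def extract_contexts(root):
    """Validate the root structure and list the available context scopes."""
    if root.tag != "config":
        raise ConfigStructureError(f"expected <config> root, found <{root.tag}>")
    if root.find("devices") is None:
        raise ConfigStructureError("missing <devices> section")
    return {
        "shared": root.find("shared") is not None,
        "vsys": [e.get("name") for e in root.findall("devices/entry/vsys/entry")],
        "device_groups": [
            e.get("name") for e in root.findall("devices/entry/device-group/entry")
        ],
    }


# Trusted literal used only for illustration; real input would be parsed
# by the secure parser before reaching this check.
SAMPLE = (
    '<config><shared/><devices><entry name="localhost.localdomain">'
    '<vsys><entry name="vsys1"/><entry name="vsys2"/></vsys>'
    "</entry></devices></config>"
)
contexts = extract_contexts(ET.fromstring(SAMPLE))
```

Because the function either returns a complete context manifest or raises, the caller never sees a partially loaded configuration.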
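XML security attacks. The defusedxml library blocks XXE (XML External Entity) attacks and entity-expansion bombs (billion-laughs) at parse time. If the stdlib fallback is used (when defusedxml is not installed), the parser applies manual mitigations. The stdlib xml.etree.ElementTree.XMLParser has no resolve_entities option (that keyword belongs to lxml's XMLParser), so the fallback has to rely on checks such as rejecting any input that contains a DOCTYPE or entity declaration before parsing. A warning is displayed when running without defusedxml.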
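One way the preferred path and the fallback could be combined is sketched below. The function name is hypothetical, and the fallback's substring check is a deliberately crude assumption: it rejects any document that declares a DTD or an entity, which is acceptable only because PAN-OS exports do not need either.

```python
import xml.etree.ElementTree as StdET

try:
    # Preferred path: defusedxml raises on DTDs and entity declarations.
    from defusedxml import ElementTree as DefusedET
except ImportError:
    DefusedET = None  # the caller should warn the user in this case


def parse_untrusted(xml_text: str):
    """Parse XML from an untrusted source, rejecting DTDs and entities."""
    if DefusedET is not None:
        # forbid_dtd=True makes defusedxml reject any DOCTYPE declaration.
        return DefusedET.fromstring(xml_text, forbid_dtd=True)
    # Fallback (assumption): refuse anything that declares a DTD or entity
    # before handing the text to the stdlib parser.
    if "<!DOCTYPE" in xml_text or "<!ENTITY" in xml_text:
        raise ValueError("DTD and entity declarations are not allowed")
    return StdET.fromstring(xml_text)


root = parse_untrusted("<config><devices/></config>")
```

The defusedxml exceptions derive from `ValueError`, so a caller can catch `ValueError` on either path and surface a single rejection message in the status bar.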
Circular group references. The reference engine tracks visited nodes during recursive group resolution. If a circular reference is detected (AddressGroup-A contains AddressGroup-B which contains AddressGroup-A), the cycle is logged, the offending group is flagged in the GUI, and graph construction continues with the cycle broken. Circular groups are reported as anomalies but do not block analysis.
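The cycle handling described here can be sketched as a depth-first traversal that carries the current path. The function name and the `groups` mapping (group name to member names) are assumptions for illustration; anything not present in `groups` is treated as a leaf object.

```python
def resolve_leaves(name, groups, path=(), cycles=None):
    """Return (leaf members, detected cycles) for a possibly nested group."""
    if cycles is None:
        cycles = []
    if name in path:
        # Re-entering a group already on the current path: record and break the cycle.
        cycles.append(path[path.index(name):] + (name,))
        return set(), cycles
    if name not in groups:
        return {name}, cycles  # a plain object, not a group
    leaves = set()
    for member in groups[name]:
        found, _ = resolve_leaves(member, groups, path + (name,), cycles)
        leaves |= found
    return leaves, cycles


groups = {
    "AddressGroup-A": ["AddressGroup-B", "host-1"],
    "AddressGroup-B": ["AddressGroup-A", "host-2"],
}
leaves, cycles = resolve_leaves("AddressGroup-A", groups)
```

In this example the traversal still yields both hosts, and `cycles` holds the loop so it can be flagged as an anomaly rather than aborting the analysis.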
Staging firewall unreachable. If the API validator cannot connect to the staging firewall (network timeout, DNS failure, TLS error), the connection dialog reports the specific error. Validation is optional — the engineer can still export the optimized XML and audit report without validation, with the report noting that API validation was skipped. Self-signed certificates are supported via a configurable TLS-verify toggle.
API key rejected. If the PAN-OS API returns an authentication error, the validator surfaces the firewall's error message and prompts for a corrected API key. The tool does not store API keys to disk.
UI responsiveness under load. All analysis operations (reference graph construction, Phase 1 scanning, Phase 2 consolidation, API validation) run on background threads via Python's threading module. The GUI main thread remains responsive and displays progress updates. A cancel button allows aborting long-running operations.
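A common pattern for this arrangement is a worker thread that reports through a queue while the GUI thread polls it. The sketch below shows only the worker side, with hypothetical names; in a Tkinter application the main thread would typically drain the queue from a callback scheduled with `after()`, since widgets should only be touched from the main thread.

```python
import queue
import threading


def run_in_background(task, items, progress, cancel):
    """Process items on a worker thread, posting status tuples to `progress`."""

    def worker():
        for index, item in enumerate(items, start=1):
            if cancel.is_set():
                # Report how many items finished before the cancel request.
                progress.put(("cancelled", index - 1))
                return
            task(item)
            progress.put(("progress", index))
        progress.put(("done", len(items)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


progress = queue.Queue()
cancel = threading.Event()
thread = run_in_background(lambda item: None, ["a", "b", "c"], progress, cancel)
thread.join()  # a GUI would poll instead of blocking

messages = []
while not progress.empty():
    messages.append(progress.get())
```

The cancel button would simply call `cancel.set()`; the worker notices the flag at the next item boundary.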
Phase gate violations. The phase controller enforces state transitions at the code level, not just at the UI level. Even if the GUI is bypassed (e.g., in a scripted/headless mode), the controller raises an exception if Phase 2 methods are called before Phase 1 completion.
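Controller-level enforcement could be as small as the state machine sketched below. The class, enum, and exception names are assumptions; the states mirror the idle, analyzing, review, committed sequence described for the phase controller. As a simplification, this sketch blocks any further Phase 1 transition as soon as Phase 2 has left the idle state.

```python
from enum import Enum


class PhaseState(Enum):
    IDLE = "idle"
    ANALYZING = "analyzing"
    REVIEW = "review"
    COMMITTED = "committed"


class PhaseGateError(RuntimeError):
    """Raised when a phase transition would violate the gate."""


_NEXT = {
    PhaseState.IDLE: PhaseState.ANALYZING,
    PhaseState.ANALYZING: PhaseState.REVIEW,
    PhaseState.REVIEW: PhaseState.COMMITTED,
}


class PhaseController:
    def __init__(self):
        self.state = {1: PhaseState.IDLE, 2: PhaseState.IDLE}

    def advance(self, phase):
        """Move the given phase to its next state, enforcing the gate."""
        if phase == 2 and self.state[1] is not PhaseState.COMMITTED:
            raise PhaseGateError("Phase 2 requires a committed Phase 1")
        if phase == 1 and self.state[2] is not PhaseState.IDLE:
            raise PhaseGateError("Phase 1 cannot run after Phase 2 has started")
        current = self.state[phase]
        if current not in _NEXT:
            raise PhaseGateError(f"Phase {phase} is already committed")
        self.state[phase] = _NEXT[current]
        return self.state[phase]
```

Because the check lives in `advance()` rather than in button handlers, a headless script calling the controller directly hits the same exception as the GUI would.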
12. Reference Graph Algorithm
The reference graph is the core data structure that enables both Phase 1 (identifying unreferenced objects) and Phase 2 (safely consolidating rules). Its construction is deterministic and complete — every object-to-rule relationship in the configuration is captured.
Graph construction (O(R × M) where R = rules, M = avg members per rule):
- Enumerate all objects. Walk the configuration tree and collect every address object, address group, service object, and service group across all context scopes (each VSYS, each device group, and the shared scope). Each object is keyed by its qualified name: `(type, context, name)`.
- Enumerate all rules. Walk every rulebase section (pre-rules, post-rules, and default rules) across Security, NAT, and PBF policy types in every context. Each rule is keyed by `(policy_type, context, rule_name)`.
- Extract references. For each rule, extract the object names referenced in its match fields: source-address, destination-address, source-user (for dynamic address groups), service, and (for NAT) translated fields. Each reference creates an edge in the graph.
- Resolve group nesting. For every address group and service group, recursively resolve its members. If AddressGroup-X contains AddressGroup-Y and AddressGroup-Y contains Address-Z, then a rule referencing AddressGroup-X has transitive edges to AddressGroup-X, AddressGroup-Y, and Address-Z. The resolution uses a depth-first traversal with cycle detection (visited set per traversal path).
- Handle inheritance. In Panorama configurations, device groups inherit objects from parent device groups and from the shared scope. The reference engine resolves references using PAN-OS's inheritance rules: a rule in Device-Group-A referencing "WebServers" first checks Device-Group-A's local objects, then its parent device group, then shared. This ensures inherited objects are correctly linked.
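The inheritance lookup in the last step can be sketched as a walk up the device-group chain that ends at the shared scope. The function name and both input mappings are hypothetical stand-ins for whatever the reference engine holds internally.

```python
def resolve_scope(name, context, local_objects, parents):
    """Return the scope that defines `name` as seen from `context`, or None.

    local_objects maps a scope name to the object names defined in it;
    parents maps a device group to its parent device group (or None).
    """
    scope = context
    while scope is not None:
        if name in local_objects.get(scope, set()):
            return scope          # nearest definition wins
        scope = parents.get(scope)
    if name in local_objects.get("shared", set()):
        return "shared"           # final fallback
    return None                   # dangling reference


local_objects = {
    "DG-Child": {"DbServers"},
    "DG-Parent": {"WebServers"},
    "shared": {"WebServers", "DNS"},
}
parents = {"DG-Child": "DG-Parent", "DG-Parent": None}

web_scope = resolve_scope("WebServers", "DG-Child", local_objects, parents)
dns_scope = resolve_scope("DNS", "DG-Child", local_objects, parents)
```

Here `WebServers` resolves to the parent device group rather than to shared, so a rule in `DG-Child` keeps the parent's definition referenced and leaves the shared copy as a cleanup candidate unless something else uses it.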
Querying the graph:
- Unreferenced objects (Phase 1): `{obj for obj in all_objects if len(object_refs[obj]) == 0 and not is_member_of_any_group(obj)}`, that is, objects with no rule references and no group memberships.
- Impact analysis (Phase 2): Before merging two objects, query all rules that reference either object to verify that the merge preserves rule semantics.
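A runnable version of the Phase 1 query might look like this. The function signature and the `group_members` mapping are assumptions; group membership is precomputed once so the check stays a set lookup per object.

```python
def find_unreferenced(all_objects, object_refs, group_members):
    """Return objects with no rule references and no group memberships.

    object_refs maps an object to the set of rules referencing it;
    group_members maps a group to the names it contains.
    """
    in_some_group = set().union(*group_members.values())
    return {
        obj
        for obj in all_objects
        if not object_refs.get(obj) and obj not in in_some_group
    }


all_objects = {"web-01", "db-01", "old-host", "grp-servers", "grp-stale", "tmp-01"}
object_refs = {"grp-servers": {"allow-web"}, "web-01": {"allow-web"}}
group_members = {"grp-servers": ["web-01", "db-01"], "grp-stale": ["tmp-01"]}

unused = find_unreferenced(all_objects, object_refs, group_members)
```

In this sketch `tmp-01` is retained on the first pass because it is still a member of `grp-stale`, even though that group is itself unused. Running the query again after `grp-stale` has been removed would pick it up.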
13. Consolidation Strategies (Phase 2)
Phase 2 applies three optimization strategies in sequence. Each strategy is conservative — it only proposes changes that are provably semantics-preserving based on the reference graph. The engineer reviews and can reject any proposal before committing.
Strategy 1 — Duplicate object merging. Identifies objects of the same type within the same context scope that have identical definitions (same IP addresses, same CIDR ranges, same port/protocol combinations). When duplicates are found, one is designated as the survivor and all rule references to the others are repointed to the survivor. The duplicates are then removed. This is the safest strategy — identical definitions guarantee identical behavior.
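Duplicate detection amounts to grouping objects by a canonical form of their definition. In the sketch below the function names are hypothetical, definitions are reduced to a set of values, and the survivor is simply the alphabetically first name; the tool's actual survivor rule is not stated in this document.

```python
from collections import defaultdict


def find_duplicates(objects):
    """Group objects by definition; return (survivor, duplicates) pairs."""
    by_definition = defaultdict(list)
    for name, definition in objects.items():
        by_definition[frozenset(definition)].append(name)
    return [
        (sorted(names)[0], sorted(names)[1:])
        for names in by_definition.values()
        if len(names) > 1
    ]


def repoint(rule_refs, survivor, duplicates):
    """Return a copy of rule_refs with duplicate names replaced by the survivor."""
    dupes = set(duplicates)
    return {
        rule: {survivor if obj in dupes else obj for obj in objs}
        for rule, objs in rule_refs.items()
    }


objects = {
    "web-01": ["10.0.0.5/32"],
    "web-legacy": ["10.0.0.5/32"],
    "db-01": ["10.0.1.9/32"],
}
pairs = find_duplicates(objects)
survivor, duplicates = pairs[0]
updated = repoint({"allow-web": {"web-legacy", "db-01"}}, survivor, duplicates)
```

Returning a new mapping instead of editing in place keeps the original references available for the before/after change record.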
Strategy 2 — Subset group collapse. Identifies address groups where one group's member set is a strict subset of another group in the same context. If Group-A = {X, Y} and Group-B = {X, Y, Z}, and every rule that references Group-A also references Group-B (or could safely reference Group-B without changing match behavior), Group-A can be collapsed into Group-B. This strategy requires careful analysis of rule actions — collapsing is only proposed when the superset substitution does not change the effective policy.
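The subset test and the conservative guard can be sketched with plain set comparisons. The function names are assumptions, and the guard implements only the first condition from the text (every rule that references the subset group also references the superset); the broader "could safely reference" analysis is not attempted here.

```python
def find_subset_pairs(groups):
    """Return (subset, superset) pairs where one member set is strictly inside another."""
    pairs = []
    for name_a, members_a in groups.items():
        for name_b, members_b in groups.items():
            if name_a != name_b and set(members_a) < set(members_b):
                pairs.append((name_a, name_b))
    return sorted(pairs)


def safe_to_collapse(subset, superset, object_refs):
    """Conservative guard: every rule using the subset must already use the superset."""
    return object_refs.get(subset, set()) <= object_refs.get(superset, set())


groups = {
    "Group-A": {"X", "Y"},
    "Group-B": {"X", "Y", "Z"},
    "Group-C": {"Q"},
}
object_refs = {"Group-A": {"rule-1"}, "Group-B": {"rule-1", "rule-2"}}

pairs = find_subset_pairs(groups)
proposals = [pair for pair in pairs if safe_to_collapse(*pair, object_refs)]
```

The pairwise comparison is quadratic in the number of groups, which is usually tolerable per context but is worth noting for very large Panorama exports.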
Strategy 3 — Adjacent rule simplification. Identifies pairs of adjacent rules (consecutive in the rulebase) with identical actions, zones, application settings, and security profiles that differ only in their address or service fields. These rules can be merged into a single rule whose address/service fields are the union of the originals. Adjacency is required because PAN-OS evaluates rules top-to-bottom and stops at the first match — merging non-adjacent rules could change match order and alter the effective policy.
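A minimal sketch of adjacent-rule merging follows. Rules are represented as plain dictionaries with hypothetical field names. The sketch is stricter than the prose: it merges only when the two rules differ in at most one of the unioned fields, because taking the union across two differing fields at once (for example source and destination) would match combinations that neither original rule matched.

```python
FIXED_FIELDS = ("action", "from", "to", "application", "profile")
UNION_FIELDS = ("source", "destination", "service")


def can_merge(a, b):
    """Rules merge only if fixed fields match and at most one union field differs."""
    same_fixed = all(a[f] == b[f] for f in FIXED_FIELDS)
    differing = [f for f in UNION_FIELDS if set(a[f]) != set(b[f])]
    return same_fixed and len(differing) <= 1


def merge_adjacent(rules):
    """Fold each rule into its predecessor when allowed; order is preserved."""
    merged = []
    for rule in rules:
        if merged and can_merge(merged[-1], rule):
            previous = merged[-1]
            for f in UNION_FIELDS:
                previous[f] = sorted(set(previous[f]) | set(rule[f]))
            previous["name"] = f'{previous["name"]}+{rule["name"]}'
        else:
            merged.append(dict(rule))  # copy so the input list is untouched
    return merged


def make_rule(name, action, source, destination, service):
    return {"name": name, "action": action, "from": ["trust"], "to": ["untrust"],
            "application": ["any"], "profile": "default",
            "source": source, "destination": destination, "service": service}


rules = [
    make_rule("r1", "allow", ["A"], ["S"], ["tcp-443"]),
    make_rule("r2", "allow", ["B"], ["S"], ["tcp-443"]),
    make_rule("r3", "deny", ["any"], ["any"], ["any"]),
]
result = merge_adjacent(rules)
```

Only consecutive rules are ever compared, so first-match evaluation order in the rulebase is unchanged.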
14. Audit Report Format
The audit report is a self-contained HTML document designed to satisfy change-management and compliance review requirements. It contains five sections:
Header. Configuration metadata (filename, firewall hostname, PAN-OS version, export timestamp), analysis metadata (tool version, analysis timestamp, engineer name if provided), and summary counts: objects analyzed, objects removed (Phase 1), rules modified (Phase 2), validation results (pass/fail counts).
Phase 1 — Removed Objects. A table listing every unreferenced object removed: object type, object name, context scope, the original XML definition, and the evidence trail (which rules and groups were checked). Objects are grouped by type and sorted by context.
Phase 2 — Consolidated Changes. A table listing every consolidation action: strategy applied, affected objects/rules, before and after XML (rendered as a side-by-side diff with additions highlighted in green and removals in red), and the rationale text generated by the strategy engine.
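The side-by-side rendering could be produced with the standard library's `difflib.HtmlDiff`, as in the sketch below. Whether the reporter actually uses difflib is not stated in this document; the function name and sample XML are illustrative only.

```python
import difflib


def render_side_by_side(before_xml: str, after_xml: str) -> str:
    """Return an HTML <table> comparing two XML snippets line by line."""
    differ = difflib.HtmlDiff(wrapcolumn=80)
    return differ.make_table(
        before_xml.splitlines(),
        after_xml.splitlines(),
        fromdesc="Before",
        todesc="After",
    )


before = '<entry name="grp-web">\n  <member>web-01</member>\n  <member>web-legacy</member>\n</entry>'
after = '<entry name="grp-web">\n  <member>web-01</member>\n</entry>'
table_html = render_side_by_side(before, after)
```

`make_table` emits only the table markup with CSS class names on changed spans, so the report template would supply the inline styles that colour additions green and removals red.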
Validation Results. A table listing every modified rule's validation outcome: rule name, test parameters (zones, addresses, service), staging firewall response (match/no-match, matched rule name), and pass/fail status. If validation was skipped, this section displays a prominent notice.
Appendix — Full Reference Graph. An optional section (toggled by the engineer before export) listing the complete object-to-rule reference map for the final optimized configuration, providing a baseline for future audits.
15. Non-Functional Requirements (Measured)
| NFR | Target | Basis |
|---|---|---|
| Phase 1 analysis time | < 60 seconds for 5,000 objects | Measured on reference hardware (modern laptop, 16 GB RAM) |
| Maximum configuration size | 50 MB XML / 10,000+ objects | Tested with synthetic configurations |
| UI responsiveness during analysis | No GUI freeze > 200 ms | Background threading; progress bar updates at 100 ms intervals |
| Reference graph accuracy | 100% — every object-rule relationship captured | Deterministic graph construction; validated against manual audits |
| Phase gate enforcement | Zero Phase 2 operations before Phase 1 completion | Controller-level enforcement with state assertions |
| Validation coverage | Every modified rule tested against staging | Per-rule API calls; skipped rules flagged in report |
| Audit report completeness | Every removal and modification documented | Report generation fails (hard error) if any change lacks a record |
| Supported PAN-OS versions | 9.x, 10.x, 11.x | Schema variations handled by version-aware parsing |
| Binary distribution size | < 60 MB (Windows .exe) | PyInstaller single-file build |
16. Tech Stack
| Layer | Technology | Role |
|---|---|---|
| Language | Python 3.8+ | Core application language; chosen for ecosystem reach and enterprise compatibility |
| GUI framework | Tkinter (stdlib) | Desktop interface: zero-dependency, ships with Python, runs on locked-down workstations |
| XML parsing (read) | `defusedxml` | Secure XML parsing with XXE/entity-expansion protection |
| XML serialization (write) | `xml.etree.ElementTree` (stdlib) | XML output; attack surface negligible for write operations |
| HTTP client | `requests` + `urllib3` | PAN-OS XML API communication with TLS support |
| Concurrency | `threading` (stdlib) | Background workers for analysis; keeps GUI responsive |
| Logging | `logging` + `traceback` (stdlib) | Diagnostics and error capture |
| Build / distribution | PyInstaller + GitHub Actions | Automated Windows binary build; single .exe output |
| Development environment | Google Colab notebook | Collaborative development and testing without local Python setup |
| Report generation | Python string templates + HTML/CSS | Self-contained HTML audit reports with inline styles |
17. Security & Compliance
Input hardening. All XML configuration files are treated as untrusted input regardless of source. The defusedxml library is the first line of defense, disabling external entity resolution, DTD processing, and entity expansion. If defusedxml is unavailable, the parser applies manual mitigations and displays a security warning.
No credential persistence. The PAN-OS API key entered for staging validation is held in memory only for the duration of the validation session. It is not written to disk, not logged, not included in the audit report, and not stored in any configuration file.
Local-only processing. Configuration files — which contain the complete network topology, IP addressing, and security policy — never leave the engineer's workstation. No telemetry, no cloud calls, no update checks. The tool is fully air-gap compatible.
Staging-only validation. The API validator connects only to the firewall hostname explicitly provided by the engineer, which is expected to be a staging/lab device. The tool includes no mechanism to push configuration changes to any firewall — it produces an XML file that the engineer imports through the standard PAN-OS workflow (which has its own commit/validation process).
Audit trail. The HTML audit report provides the evidence chain that compliance and change-management processes require: what was changed, why (strategy rationale), and whether it was validated (API results). This supports ITIL change-management workflows and SOX/PCI audit requirements for documented firewall changes.
18. Deployment & Operations
From source. Clone the repository and run python gui.py. Requires Python 3.8+ with defusedxml and requests installed (pip install defusedxml requests). Tkinter is included in standard Python installations on Windows and macOS; on Linux, it may require python3-tk.
Windows binary. A GitHub Actions workflow runs PyInstaller to produce a single-file Windows executable (panos-refactor.exe). The workflow triggers on tagged releases. The binary bundles Python, all dependencies, and the Tkinter runtime — no Python installation required on the target machine.
Google Colab. A Colab notebook is provided for development and testing. It installs dependencies, mounts Google Drive for configuration files, and runs the analysis in headless mode (no GUI). This is useful for rapid iteration and for environments where local Python installation is impractical.
Typical operational flow: export the PAN-OS XML from the firewall → load it into the tool → run Phase 1 → review and commit → run Phase 2 → review and commit → validate against staging → export the optimized XML and audit report → import the optimized XML into the firewall through the standard change-management process.
19. Cross-Project Context
Network Threat Pipeline (shared security domain). The Network Threat Pipeline project processes network threat intelligence from external feeds (abuse.ch, Emerging Threats) and generates PAN-OS External Dynamic Lists (EDLs) that the firewall consumes. While the Threat Pipeline operates on external threat data and PAN-OS Universal Refactor operates on internal configuration, they share a user context: the same network security engineer uses both tools, and a firewall's EDL references appear in the reference graph that PAN-OS Universal Refactor analyzes. Understanding this relationship ensures the refactor tool correctly handles EDL-backed address objects (which are dynamically populated and should not be flagged as "empty" during cleanup).
OFAC Deny List (compliance pattern). The OFAC Deny List project automates compliance-list processing with audit-trail generation — a pattern that PAN-OS Universal Refactor's audit report follows. Both projects share the principle that automated changes to security-critical configurations require documented, reviewable evidence.
20. Risks, Assumptions & Limitations
- Staging fidelity. API validation is only as good as the staging firewall's configuration. If the staging environment does not mirror production's zone structure, NAT rules, or routing, validation results may not reflect production behavior. The engineer is responsible for staging fidelity.
- PAN-OS schema evolution. Future PAN-OS versions may introduce new object types, policy types, or XML schema changes that the parser does not handle. The version-aware parsing layer mitigates this for known versions (9.x–11.x), but new major versions will require parser updates.
- Consolidation conservatism. The consolidation strategies are deliberately conservative — they may miss optimization opportunities that a human expert would recognize (e.g., semantic equivalence between differently-structured rules). This is a design choice: false negatives (missed optimizations) are acceptable; false positives (incorrect merges) are not.
- Desktop distribution. As a desktop tool, distribution and version control are manual. The GitHub Actions build pipeline produces binaries, but deploying updates to engineers' workstations depends on organizational software-distribution processes.
- Group nesting depth. Deeply nested address groups (>10 levels) can slow reference graph construction. This is uncommon in practice but is a theoretical performance concern for pathologically structured configurations.
- Single-firewall scope. The tool analyzes one configuration file at a time. Cross-firewall analysis (e.g., identifying objects duplicated across multiple firewalls in a Panorama-managed estate) is not supported in the current version.
21. Roadmap
Phase 1 — Core cleanup and validation (current — v6.0). Unreferenced object removal, phased gating, API validation against staging, HTML audit reports. Supports Security, NAT, and PBF policies across VSYS, Device-Group, and Shared contexts.
Phase 2 — Advanced consolidation. Expand consolidation strategies: rule shadowing detection (rules that can never match because a higher-priority rule covers all their traffic), application-aware merging (rules that differ only in application lists where the applications share a risk profile), and tag-based grouping (leveraging PAN-OS tags to organize consolidated objects).
Phase 3 — Panorama multi-device intelligence. Cross-device-group analysis: identify objects that are duplicated across device groups and could be promoted to the shared scope. Panorama commit-scope awareness — ensure consolidation respects device-group push boundaries.
Phase 4 — Drift detection. Compare two configuration snapshots (e.g., last-audited vs. current) and highlight what changed, what was added without review, and what was removed. Integrates with the audit report format to produce a change-delta document.
Diagrams: Component Diagram · State Diagram · Data Flow Diagram