ML-Based Risk Scoring
Overview
Malwar includes an ML-based risk scoring model that complements the rule engine by estimating the probability that a SKILL.md file is malicious. The model is a logistic regression classifier trained on labeled examples and implemented in pure Python, with no external ML library dependencies.
Architecture
SKILL.md
|
v
[Feature Extractor] --> 20-dimensional feature vector
|
v
[Risk Scorer] --> P(malicious) in [0.0, 1.0]
|
v
[Calibrator] --> Blended score (weighted avg of rule + ML scores)
Components
- `malwar.ml.features.FeatureExtractor` -- Extracts 20 numerical features from a `SkillContent` object
- `malwar.ml.model.RiskScorer` -- Logistic regression model: `P(malicious) = sigmoid(X @ weights + bias)`
- `malwar.ml.trainer.ModelTrainer` -- Trains/retrains the model from labeled skill files
- `malwar.ml.calibrator.RiskCalibrator` -- Blends ML and rule engine scores
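The scoring formula used by `RiskScorer` can be sketched in a few lines of pure Python. This is an illustration only, not Malwar's actual code: the function names `sigmoid` and `score` are hypothetical, and standardizing each feature with a stored mean and standard deviation before the dot product is an assumption.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def score(features, weights, bias, means, stds):
    """Return P(malicious) = sigmoid(x . w + b) for one feature vector.

    Standardizing with means/stds is an assumption made for this sketch.
    """
    z = bias
    for x, w, m, s in zip(features, weights, means, stds):
        # Skip zero-variance features so they cannot cause a division by zero.
        z += w * ((x - m) / s if s > 0 else 0.0)
    return sigmoid(z)
```

A feature vector that sits exactly at the mean, scored with a zero bias, yields 0.5: the model has no evidence in either direction.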
Features Extracted
| # | Feature | Description |
|---|---|---|
| 1 | `line_count` | Total number of lines in the file |
| 2 | `file_size_bytes` | File size in bytes |
| 3 | `code_block_count` | Number of fenced code blocks |
| 4 | `code_block_ratio` | Ratio of code block characters to total characters |
| 5 | `url_count` | Total number of URLs found |
| 6 | `external_url_ratio` | URL count normalized by line count |
| 7 | `unique_domain_count` | Number of unique domains in URLs |
| 8 | `untrusted_domain_ratio` | Fraction of domains not in trusted list |
| 9 | `encoded_content_ratio` | Ratio of base64-like content to total |
| 10 | `command_pattern_density` | Suspicious command patterns per line |
| 11 | `env_var_reference_count` | References to sensitive environment variables |
| 12 | `pipe_to_bash_count` | `\| bash` / `\| sh` patterns |
| 13 | `prompt_injection_score` | Prompt injection indicator density [0, 1] |
| 14 | `content_entropy` | Shannon entropy of the content (bits/char) |
| 15 | `section_count` | Number of markdown sections |
| 16 | `metadata_completeness` | How complete the YAML frontmatter is [0, 1] |
| 17 | `hidden_content_ratio` | HTML comments / total content ratio |
| 18 | `exfiltration_pattern_count` | Data exfiltration pattern matches |
| 19 | `avg_code_block_length` | Average length of code blocks |
| 20 | `hex_escape_density` | Hex escape sequences per body character |
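To make a few of these features concrete, the sketch below computes `content_entropy`, `pipe_to_bash_count`, and `hex_escape_density` from raw text. The regular expressions and function bodies are assumptions chosen for illustration; the patterns Malwar actually uses may differ.

```python
import math
import re
from collections import Counter

# Hypothetical patterns for illustration; Malwar's real patterns may differ.
PIPE_TO_SHELL = re.compile(r"\|\s*(?:bash|sh)\b")
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]{2}")


def content_entropy(text: str) -> float:
    """Shannon entropy of the text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def pipe_to_bash_count(text: str) -> int:
    """Count occurrences of output piped into bash or sh."""
    return len(PIPE_TO_SHELL.findall(text))


def hex_escape_density(text: str) -> float:
    """Hex escape sequences such as \\x41 per character of text."""
    return len(HEX_ESCAPE.findall(text)) / len(text) if text else 0.0
```

Count-style features grow with file length, while ratio and density features are normalized by it. That difference is one reason a model of this kind typically rescales features before scoring.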
Configuration
| Environment Variable | Default | Description |
|---|---|---|
| `MALWAR_ML_ENABLED` | `true` | Enable/disable ML scoring |
| `MALWAR_ML_WEIGHT` | `0.3` | Weight of ML score in blended result (0.0-1.0) |
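When `MALWAR_ML_WEIGHT=0.3`, the blended score is a weighted average that takes 30% from the ML score and the remaining 70% from the rule engine score.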
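One way to read this weighting is sketched below. The helper names `blend_scores` and `ml_weight_from_env` are hypothetical, and the sketch assumes both the rule score and the ML score are already on a 0.0 to 1.0 scale; Malwar's `RiskCalibrator` may normalize or combine them differently.

```python
import os


def blend_scores(rule_score: float, ml_score: float, ml_weight: float = 0.3) -> float:
    """Weighted average of rule and ML scores; both assumed to be in [0.0, 1.0]."""
    if not 0.0 <= ml_weight <= 1.0:
        raise ValueError("ml_weight must be between 0.0 and 1.0")
    return (1.0 - ml_weight) * rule_score + ml_weight * ml_score


def ml_weight_from_env(environ=os.environ) -> float:
    """Read MALWAR_ML_WEIGHT, falling back to the documented default of 0.3."""
    return float(environ.get("MALWAR_ML_WEIGHT", "0.3"))
```

With a rule score of 0.8 and an ML score of 0.2, the default weight gives roughly 0.7 * 0.8 + 0.3 * 0.2 = 0.62, so the rule engine still dominates the result.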
Model Format
The model is stored as a JSON file (src/malwar/ml/weights.json) containing:
{
  "weights": [w1, w2, ..., w20],
  "bias": 0.123,
  "feature_means": [m1, m2, ..., m20],
  "feature_stds": [s1, s2, ..., s20],
  "metadata": {
    "version": "1.0.0",
    "trained_at": "2026-02-20T...",
    "num_features": 20,
    "training_samples": 24,
    "training_accuracy": 1.0,
    "feature_names": ["line_count", ...]
  }
}
JSON is used instead of pickle for security -- no arbitrary code execution on model load.
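A loader for this format might look like the sketch below. The function name `load_model` and the specific shape checks are assumptions for illustration; the key names come from the format shown above. The point is that `json.load` only parses data, so a tampered weights file can at worst produce bad numbers or a validation error.

```python
import json

REQUIRED_KEYS = ("weights", "bias", "feature_means", "feature_stds")


def load_model(path: str, num_features: int = 20) -> dict:
    """Load model parameters from JSON and check their shape."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)  # parses data only; never executes code
    for key in REQUIRED_KEYS:
        if key not in data:
            raise ValueError(f"missing key: {key}")
    for key in ("weights", "feature_means", "feature_stds"):
        if len(data[key]) != num_features:
            raise ValueError(f"{key} must have {num_features} entries")
    return data
```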
CLI Commands
Train/retrain the model
malwar ml train
malwar ml train --fixtures-dir /path/to/labeled/skills
malwar ml train --lr 0.5 --epochs 1000 --output /path/to/weights.json
View model info
Pipeline Integration
When ML scoring is enabled, after all detection layers complete, the pipeline:
- Extracts features from the scanned `SkillContent`
- Runs the logistic regression model to get `P(malicious)`
- Stores `ml_risk_score` on the `ScanResult` object
- Logs the blended score for observability
The ml_risk_score field on ScanResult is optional and does not affect existing rule-based verdicts.
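The optional nature of the field can be illustrated with a small sketch. The `ScanResult` class here is a hypothetical stand-in with only two fields, not Malwar's real class, and `ml_enabled` and `apply_ml_stage` are invented names; the accepted spellings for `MALWAR_ML_ENABLED` are likewise an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanResult:
    """Hypothetical stand-in for Malwar's ScanResult; only two fields shown."""

    verdict: str
    ml_risk_score: Optional[float] = None


def ml_enabled(environ=os.environ) -> bool:
    """Interpret MALWAR_ML_ENABLED; the accepted spellings are an assumption."""
    value = environ.get("MALWAR_ML_ENABLED", "true")
    return value.strip().lower() in {"1", "true", "yes"}


def apply_ml_stage(result, content, extract, score, enabled=True):
    """Attach an ML score without touching the rule-based verdict."""
    if not enabled:
        return result  # ml_risk_score stays None
    result.ml_risk_score = score(extract(content))
    return result
```

Because the stage only writes `ml_risk_score`, code that reads the verdict alone behaves the same whether ML scoring is on or off.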
Training
The initial model is trained on the test fixture files (5 benign + 19 malicious). To retrain with additional labeled data:
- Place benign `.md` files in a `benign/` subdirectory
- Place malicious `.md` files in a `malicious/` subdirectory
- Run `malwar ml train --fixtures-dir /path/to/directory`
The trainer uses gradient descent with L2 regularization on binary cross-entropy loss. All math is implemented in pure Python -- no numpy or scikit-learn dependency.