Scheduled Scanning

Malwar supports periodic re-scanning of SKILL.md files on configurable cron schedules. This feature enables continuous monitoring of skill files for emerging threats without manual intervention.

Overview

The scheduler runs as an asyncio background task inside the malwar serve process. It checks for due jobs every 30 seconds and executes scans using the same ScanPipeline as manual and API-triggered scans.
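The polling behavior described above can be sketched as follows. This is a minimal illustration, not the actual SchedulerEngine code: the Job dataclass, poll_due_jobs, and the run_scan and reschedule callbacks are hypothetical names, and only the 30-second interval comes from the description.

```python
import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Awaitable, Callable, List


@dataclass
class Job:
    """Hypothetical stand-in for a scheduled job record."""
    job_id: str
    next_run: datetime


async def poll_due_jobs(
    jobs: List[Job],
    run_scan: Callable[[Job], Awaitable[None]],
    reschedule: Callable[[Job, datetime], datetime],
    stop: asyncio.Event,
    interval: float = 30.0,
) -> None:
    """Wake every `interval` seconds and run each job whose next_run has passed."""
    while not stop.is_set():
        now = datetime.now(timezone.utc)
        for job in jobs:
            if job.next_run <= now:
                await run_scan(job)
                # Advance next_run so the job is not picked up again on the next tick.
                job.next_run = reschedule(job, now)
        try:
            # Sleep until the next tick, but wake early if a stop is requested.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
```

Waiting on the stop event instead of calling asyncio.sleep lets the loop shut down promptly rather than finishing a full 30-second sleep first.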

Architecture

SchedulerEngine (asyncio loop)
    |
    +-- JobStore (SQLite: scheduled_jobs, job_runs)
    |
    +-- ScanPipeline (rule_engine -> url_crawler -> llm_analyzer -> threat_intel)

Key modules:

| Module | Description |
| --- | --- |
| malwar.scheduler.engine | SchedulerEngine class: asyncio background loop |
| malwar.scheduler.jobs | ScanJob and JobRun dataclasses |
| malwar.scheduler.cron | Cron expression parser and next-run calculator |
| malwar.scheduler.store | JobStore for SQLite persistence |

Cron Expression Format

The scheduler uses standard 5-field cron expressions:

minute  hour  day  month  weekday

Each field supports:

  • * — any value
  • N — specific value (e.g., 5)
  • N,M — list of values (e.g., 1,15)
  • N-M — range of values (e.g., 9-17)
  • */N — step values (e.g., */15)

Weekday values: 0=Sunday, 1=Monday, ..., 6=Saturday

Examples

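| Expression | Meaning |
| --- | --- |
| 0 */6 * * * | Every 6 hours, on the hour (00:00, 06:00, 12:00, 18:00) |
| */15 * * * * | Every 15 minutes |
| 0 2 * * * | Daily at 2:00 AM |
| 0 0 1 * * | Monthly on the 1st at midnight |
| 0 9-17 * * 1-5 | Hourly on the hour from 9:00 AM through 5:00 PM, Monday through Friday |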
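To make the field syntax and the weekday numbering concrete, here is a small sketch of how such expressions could be expanded and matched. It is not the malwar.scheduler.cron implementation; expand_field, parse_cron, and next_run are hypothetical names. It also assumes that all five fields must match at once, whereas some cron implementations treat day and weekday as alternatives when both are restricted.

```python
from datetime import datetime, timedelta
from typing import List, Set

# (min, max) bounds for minute, hour, day, month, weekday
FIELD_BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 6)]


def expand_field(field: str, lo: int, hi: int) -> Set[int]:
    """Expand one cron field into the set of values it matches."""
    values: Set[int] = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(lo, hi + 1))
        elif part.startswith("*/"):
            values.update(range(lo, hi + 1, int(part[2:])))
        elif "-" in part:
            start, end = (int(x) for x in part.split("-"))
            values.update(range(start, end + 1))
        else:
            values.add(int(part))
    if not values or min(values) < lo or max(values) > hi:
        raise ValueError(f"field {field!r} is outside {lo}-{hi}")
    return values


def parse_cron(expr: str) -> List[Set[int]]:
    """Split a 5-field expression and expand each field."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected 5 fields: minute hour day month weekday")
    return [expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_BOUNDS)]


def next_run(expr: str, after: datetime) -> datetime:
    """Brute-force search, minute by minute, for the first match after `after`."""
    minute, hour, day, month, weekday = parse_cron(expr)
    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):  # give up after about one year
        # datetime.weekday() uses Monday=0; the cron convention here is Sunday=0.
        cron_weekday = (candidate.weekday() + 1) % 7
        if (candidate.minute in minute and candidate.hour in hour
                and candidate.day in day and candidate.month in month
                and cron_weekday in weekday):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError(f"no run time found within a year for {expr!r}")
```

For example, next_run("0 9-17 * * 1-5", datetime(2024, 1, 5, 17, 0)) starts from a Friday at 5:00 PM and returns Monday, January 8, 2024 at 9:00 AM. The minute-by-minute search favors clarity over speed; a production parser would typically jump directly between candidate fields.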

CLI Usage

Create a scheduled scan

malwar schedule create /path/to/SKILL.md \
  --cron "0 */6 * * *" \
  --name "My 6-hourly scan"

Options:

  • --cron (required) — Cron expression for the schedule
  • --name / -n — Human-readable name
  • --layers — Comma-separated list of detection layers to run
  • --disabled — Create in disabled state

List scheduled jobs

malwar schedule list

Run a job immediately

malwar schedule run <job_id>

API Endpoints

All endpoints are under /api/v1/schedules and require authentication when API keys are configured.

Create a schedule

POST /api/v1/schedules
Content-Type: application/json

{
  "name": "Nightly full scan",
  "target_path": "/skills/production/SKILL.md",
  "schedule": "0 2 * * *",
  "layers": ["rule_engine", "url_crawler", "llm_analyzer", "threat_intel"],
  "enabled": true
}
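As an illustration, the same request could be built from Python with only the standard library. The base URL http://localhost:8000 and the build_create_request helper are assumptions for this sketch. The authentication header is not specified here, so any credentials would have to be passed through the hypothetical extra_headers argument according to your deployment.

```python
import json
import urllib.request
from typing import Optional


def build_create_request(
    base_url: str,
    payload: dict,
    extra_headers: Optional[dict] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that creates a schedule."""
    headers = {"Content-Type": "application/json"}
    headers.update(extra_headers or {})  # e.g. authentication, if configured
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/schedules",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


payload = {
    "name": "Nightly full scan",
    "target_path": "/skills/production/SKILL.md",
    "schedule": "0 2 * * *",
    "layers": ["rule_engine", "url_crawler", "llm_analyzer", "threat_intel"],
    "enabled": True,
}
request = build_create_request("http://localhost:8000", payload)

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```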

List all schedules

GET /api/v1/schedules

Get schedule details (includes recent runs)

GET /api/v1/schedules/{job_id}

Update a schedule

PUT /api/v1/schedules/{job_id}
Content-Type: application/json

{
  "schedule": "0 */12 * * *",
  "enabled": false
}

Delete a schedule

DELETE /api/v1/schedules/{job_id}

Trigger immediate run

POST /api/v1/schedules/{job_id}/run

Server Integration

The scheduler starts automatically when malwar serve is called. To disable it:

malwar serve --no-scheduler

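This is useful during development or when running multiple worker processes. Only one worker should run the scheduler; otherwise each worker would start its own scheduler loop and the same job could be executed more than once.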

Database Schema

The feature adds two tables via migration 005:

scheduled_jobs:

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Job identifier |
| name | TEXT | Human-readable name |
| target_path | TEXT | Path to SKILL.md file |
| schedule | TEXT | Cron expression |
| layers | TEXT | Comma-separated detection layers |
| enabled | INTEGER | 1=enabled, 0=disabled |
| last_run | TEXT | ISO timestamp of last execution |
| next_run | TEXT | ISO timestamp of next scheduled run |
| created_at | TEXT | ISO timestamp of creation |

job_runs:

| Column | Type | Description |
| --- | --- | --- |
| id | TEXT PK | Run identifier |
| job_id | TEXT FK | References scheduled_jobs(id) |
| scan_id | TEXT | Associated scan result ID |
| status | TEXT | pending, running, completed, failed |
| verdict | TEXT | Scan verdict (CLEAN, CAUTION, etc.) |
| risk_score | INTEGER | Computed risk score |
| error | TEXT | Error message if failed |
| started_at | TEXT | ISO timestamp |
| completed_at | TEXT | ISO timestamp |
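For experimentation, the two tables can be approximated in an in-memory SQLite database. The DDL below is reconstructed from the column listings above and may differ from the actual migration 005 (constraints, defaults, and indexes are not documented here). The due_jobs helper and its query are hypothetical; it assumes all timestamps share one ISO format and timezone so that string comparison matches chronological order.

```python
import sqlite3
from typing import List

SCHEMA = """
CREATE TABLE scheduled_jobs (
    id          TEXT PRIMARY KEY,
    name        TEXT,
    target_path TEXT,
    schedule    TEXT,
    layers      TEXT,
    enabled     INTEGER,
    last_run    TEXT,
    next_run    TEXT,
    created_at  TEXT
);
CREATE TABLE job_runs (
    id           TEXT PRIMARY KEY,
    job_id       TEXT REFERENCES scheduled_jobs(id),
    scan_id      TEXT,
    status       TEXT,
    verdict      TEXT,
    risk_score   INTEGER,
    error        TEXT,
    started_at   TEXT,
    completed_at TEXT
);
"""


def due_jobs(conn: sqlite3.Connection, now_iso: str) -> List[str]:
    """Return IDs of enabled jobs whose next_run is at or before now_iso."""
    rows = conn.execute(
        "SELECT id FROM scheduled_jobs "
        "WHERE enabled = 1 AND next_run <= ? ORDER BY next_run",
        (now_iso,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO scheduled_jobs (id, name, target_path, schedule, layers, enabled, next_run) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        # Illustrative rows only; the IDs and timestamps are made up.
        ("job-1", "Nightly", "/skills/production/SKILL.md", "0 2 * * *",
         "rule_engine,threat_intel", 1, "2024-01-01T02:00:00"),
        ("job-2", "Paused", "/skills/production/SKILL.md", "0 2 * * *",
         "rule_engine", 0, "2024-01-01T02:00:00"),
    ],
)
ready = due_jobs(conn, "2024-01-01T02:00:30")
```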