Overview
This project provides lightweight disk monitoring for a VPS with:
- Daily storage reports
- Rapid disk usage change detection
- Alerts sent to a Matrix room
- Minimal dependencies (Python + SQLite only)
It is designed to be:
- Simple
- Transparent
- Easy to debug
- Low overhead
Architecture
systemd timers
↓
Python script (disk_monitor.py)
↓
SQLite (local state/history)
↓
Matrix API (alerts + reports)
Components
1. Python Script
Location:
/opt/diskmon/disk_monitor.py
Responsibilities:
- Collect disk usage stats
- Store historical samples
- Detect rapid changes
- Format messages
- Send messages to Matrix
2. SQLite Database
Location:
/var/lib/diskmon/diskmon.sqlite3
Purpose:
- Store disk usage history
- Track alert cooldowns
3. Environment Config
Location:
/etc/diskmon.env
Contents:
MATRIX_HOMESERVER=https://matrix.yourdomain.com
MATRIX_ROOM_ID=!roomid:yourdomain.com
MATRIX_ACCESS_TOKEN=your_token
DISKMON_DB=/var/lib/diskmon/diskmon.sqlite3
DISKMON_MOUNT=/
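A minimal sketch of how the script might read these settings, assuming they arrive as ordinary environment variables. The `Config` class, the `load_config` helper, and the fallback defaults for `DISKMON_DB` and `DISKMON_MOUNT` are illustrative assumptions, not taken from disk_monitor.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    homeserver: str
    room_id: str
    access_token: str
    db_path: str
    mount: str


def load_config(env=os.environ) -> Config:
    """Build a Config from environment variables, failing early if one is missing."""
    required = ("MATRIX_HOMESERVER", "MATRIX_ROOM_ID", "MATRIX_ACCESS_TOKEN")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise SystemExit(f"missing environment variables: {', '.join(missing)}")
    return Config(
        # Strip a trailing slash so URL paths can be appended safely later.
        homeserver=env["MATRIX_HOMESERVER"].rstrip("/"),
        room_id=env["MATRIX_ROOM_ID"],
        access_token=env["MATRIX_ACCESS_TOKEN"],
        # Assumed defaults, copied from the example values in this README.
        db_path=env.get("DISKMON_DB", "/var/lib/diskmon/diskmon.sqlite3"),
        mount=env.get("DISKMON_MOUNT", "/"),
    )
```

Failing with a message that names the missing variables is usually easier to diagnose than a bare `KeyError` deep inside the script.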
4. systemd Timers
Sample Timer (every 5 min)
diskmon-sample.timer
Report Timer (daily)
diskmon-report.timer
Data Flow
Sampling Loop (every 5 minutes)
- Read disk usage (shutil.disk_usage)
- Insert sample into SQLite
- Compare against:
  - 10-minute-old sample
  - 60-minute-old sample
- Trigger alerts if thresholds exceeded
- Apply cooldown logic
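The first three steps of the sampling loop could look roughly like this. It is a sketch that assumes a `samples` table with the columns listed under Database Schema; the function names `record_sample` and `used_at_or_before` are hypothetical.

```python
import shutil
import sqlite3
import time


def record_sample(conn: sqlite3.Connection, mount: str, now=None) -> int:
    """Insert one usage sample for mount and return the used bytes."""
    now = int(time.time()) if now is None else now
    usage = shutil.disk_usage(mount)  # named tuple: total, used, free
    conn.execute(
        "INSERT INTO samples (ts, mount, used_bytes, avail_bytes, total_bytes) "
        "VALUES (?, ?, ?, ?, ?)",
        (now, mount, usage.used, usage.free, usage.total),
    )
    conn.commit()
    return usage.used


def used_at_or_before(conn: sqlite3.Connection, mount: str, ts: int):
    """Return used_bytes of the newest sample taken at or before ts, or None."""
    row = conn.execute(
        "SELECT used_bytes FROM samples "
        "WHERE mount = ? AND ts <= ? ORDER BY ts DESC LIMIT 1",
        (mount, ts),
    ).fetchone()
    return None if row is None else row[0]
```

With a 5-minute timer, `used_at_or_before(conn, mount, now - 600)` would yield the 10-minute-old reference value. Picking "the newest sample at or before the cutoff" tolerates timer jitter, but after a long gap in sampling it can return a much older sample than intended.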
Daily Report
- Read current disk usage
- Format summary
- Send to Matrix
Database Schema
samples
| column | type | description |
|---|---|---|
| id | int | primary key |
| ts | int | unix timestamp |
| mount | text | mount path |
| used_bytes | int | used disk space |
| avail_bytes | int | free space |
| total_bytes | int | total capacity |
alerts
| column | type | description |
|---|---|---|
| key | text | alert identifier |
| last_sent_ts | int | last time alert was triggered |
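One way to create both tables from Python is sketched below. The column names follow the tables above; the `NOT NULL` constraints, the primary key on `alerts.key`, and the index on `(mount, ts)` are assumptions added for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS samples (
    id          INTEGER PRIMARY KEY,
    ts          INTEGER NOT NULL,   -- unix timestamp
    mount       TEXT    NOT NULL,
    used_bytes  INTEGER NOT NULL,
    avail_bytes INTEGER NOT NULL,
    total_bytes INTEGER NOT NULL
);
-- Assumed index: speeds up "sample from N minutes ago" lookups per mount.
CREATE INDEX IF NOT EXISTS idx_samples_mount_ts ON samples (mount, ts);
CREATE TABLE IF NOT EXISTS alerts (
    key          TEXT PRIMARY KEY,  -- alert identifier
    last_sent_ts INTEGER NOT NULL
);
"""


def init_db(path: str) -> sqlite3.Connection:
    """Open the database at path and create the tables if they do not exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Because every statement uses `IF NOT EXISTS`, calling `init_db` on each run is harmless and also recreates the schema after the database file has been deleted.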
Alert Logic
Thresholds
| Condition | Trigger |
|---|---|
| Warning | ≥ 1 GiB increase in 10 minutes |
| Critical | ≥ 10 GiB increase in 60 minutes |
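The threshold table translates into a small lookup. This is a sketch only: the alert keys `"warning"` and `"critical"` and the `triggered` function are hypothetical names, while the windows and sizes come from the table above.

```python
GIB = 1024 ** 3

# (alert key, window in seconds, minimum increase in bytes)
THRESHOLDS = [
    ("warning", 10 * 60, 1 * GIB),
    ("critical", 60 * 60, 10 * GIB),
]


def triggered(current_used, previous_used_by_window):
    """Return the keys of all thresholds whose increase was met or exceeded.

    previous_used_by_window maps a window length in seconds to the used-bytes
    value of the sample from that long ago (or None if there is no such sample).
    """
    hits = []
    for key, window, min_increase in THRESHOLDS:
        previous = previous_used_by_window.get(window)
        if previous is not None and current_used - previous >= min_increase:
            hits.append(key)
    return hits
```

For example, going from 48.2 GiB to 49.6 GiB within 10 minutes is a 1.4 GiB increase, which meets the warning threshold but not the critical one. A decrease in usage never triggers, because the difference is negative.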
Cooldowns
| Alert Type | Cooldown |
|---|---|
| Warning | 30 minutes |
| Critical | 60 minutes |
Why cooldowns exist
Prevents:
- Alert spam
- Repeated messages for same event
- Noise during sustained writes
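A cooldown check against the `alerts` table might be sketched as follows, assuming `key` is the table's primary key and that the alert keys are named `"warning"` and `"critical"` (both assumptions). The `should_send` helper is hypothetical.

```python
import sqlite3

# Cooldown per alert key, in seconds, from the table above.
COOLDOWNS = {"warning": 30 * 60, "critical": 60 * 60}


def should_send(conn: sqlite3.Connection, key: str, now: int) -> bool:
    """Return True and record now as the send time if the cooldown has elapsed."""
    row = conn.execute(
        "SELECT last_sent_ts FROM alerts WHERE key = ?", (key,)
    ).fetchone()
    if row is not None and now - row[0] < COOLDOWNS[key]:
        return False  # still cooling down; leave last_sent_ts unchanged
    # INSERT OR REPLACE relies on key being unique (assumed PRIMARY KEY).
    conn.execute(
        "INSERT OR REPLACE INTO alerts (key, last_sent_ts) VALUES (?, ?)",
        (key, now),
    )
    conn.commit()
    return True
```

Leaving `last_sent_ts` untouched while an alert is suppressed means a sustained write produces one message per cooldown period rather than being silenced indefinitely.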
Message Formats
Daily Report
[VPS Storage Report]
Mount: /
Used: 48.2 GiB
Available: 131.7 GiB
Total: 180.0 GiB
Usage: 26.8%
Timestamp: 2026-04-01 09:00:00 EDT
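A report in this shape can be produced with plain string formatting. The sketch below assumes byte counts as input and a pre-formatted timestamp string; `format_report` is a hypothetical name.

```python
GIB = 1024 ** 3


def format_report(mount, used, avail, total, timestamp):
    """Render the daily report text from byte counts and a timestamp string."""
    pct = used / total * 100 if total else 0.0
    return "\n".join([
        "[VPS Storage Report]",
        f"Mount: {mount}",
        f"Used: {used / GIB:.1f} GiB",
        f"Available: {avail / GIB:.1f} GiB",
        f"Total: {total / GIB:.1f} GiB",
        f"Usage: {pct:.1f}%",
        f"Timestamp: {timestamp}",
    ])
```

Used and Available need not add up exactly to Total, since many filesystems reserve some blocks for the root user.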
Alert
[Storage Alert]
Mount: /
Used space increased by 1.4 GiB in 10 minutes
Previous used: 48.2 GiB
Current used: 49.6 GiB
Timestamp: 2026-04-01 09:40:00 EDT
Monitoring the System
Check timers
systemctl list-timers | grep diskmon
Check logs
Sample job
journalctl -u diskmon-sample.service -f
Report job
journalctl -u diskmon-report.service -f
Run manually
systemctl start diskmon-sample.service
systemctl start diskmon-report.service
Check service status
systemctl status diskmon-sample.service
systemctl status diskmon-report.service
Debugging
1. Environment variables not found
Symptom:
KeyError: 'MATRIX_HOMESERVER'
Fix (when running the script manually from a shell, export the variables first):
set -a
source /etc/diskmon.env
set +a
2. SQLite errors
Symptom:
sqlite3.OperationalError
Fix:
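- Check SQL syntax
- Check that /var/lib/diskmon exists and is writable by the user running the service
- Delete the database and recreate it if needed (this discards the sample history and alert cooldown state):
rm /var/lib/diskmon/diskmon.sqlite3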
3. No Matrix messages
Check:
- The homeserver URL is correct
- The access token is valid
- The room ID is correct
- The homeserver URL uses HTTPS
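To test these settings in isolation, a message can be sent with the standard library alone. The sketch below targets the Matrix client-server send-event endpoint with a bearer token; whether disk_monitor.py builds its request the same way is an assumption, and `build_matrix_request` and `send` are hypothetical helper names.

```python
import json
import time
import urllib.parse
import urllib.request


def build_matrix_request(homeserver, room_id, access_token, body, txn_id=None):
    """Build (but do not send) a PUT request for a plain-text m.room.message."""
    # The transaction ID must be unique per message for a given access token.
    txn_id = txn_id or f"diskmon-{time.time_ns()}"
    url = "{}/_matrix/client/v3/rooms/{}/send/m.room.message/{}".format(
        homeserver.rstrip("/"),
        urllib.parse.quote(room_id, safe=""),  # "!" and ":" must be escaped
        urllib.parse.quote(txn_id, safe=""),
    )
    payload = json.dumps({"msgtype": "m.text", "body": body}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=10):
    """Send the request; raises urllib.error.HTTPError on a 4xx/5xx response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

An HTTP 401 or 403 from `send` typically points at the token or room membership, while a connection or TLS error points at the homeserver URL.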
4. Script not running
systemctl status diskmon-sample.timer
Testing Alerts
Trigger disk usage spike
fallocate -l 2G /tmp/testfile
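Wait ~5–10 minutes for the next sample to run. If /tmp is a separate filesystem (for example tmpfs), create the file on the monitored mount instead; otherwise usage of / will not change.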
Cleanup:
rm /tmp/testfile
Maintenance
View database
sqlite3 /var/lib/diskmon/diskmon.sqlite3
Clean old data
Handled automatically:
- keeps ~2 days of samples
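The automatic cleanup presumably amounts to a single `DELETE` per run. A sketch, assuming a fixed two-day retention and the `samples` table from the schema; `prune_samples` and `RETENTION_SECONDS` are hypothetical names.

```python
import sqlite3

RETENTION_SECONDS = 2 * 24 * 60 * 60  # about two days


def prune_samples(conn: sqlite3.Connection, now: int) -> int:
    """Delete samples older than the retention window; return rows removed."""
    cur = conn.execute(
        "DELETE FROM samples WHERE ts < ?", (now - RETENTION_SECONDS,)
    )
    conn.commit()
    return cur.rowcount
```

At one sample every 5 minutes, two days of history is 576 rows per mount, so the database stays small.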
Extending the System
Possible improvements
- Monitor multiple mounts
- Add low disk space alerts (e.g. < 20 GB)
- Send HTML-formatted Matrix messages
- Integrate with Uptime Kuma push monitor
- Add inode monitoring
- Add disk I/O rate tracking
Design Philosophy
This system intentionally avoids:
- Prometheus
- external monitoring stacks
- heavy dependencies
Instead it focuses on:
- clarity
- reliability
- minimalism
- full control over alert logic
Summary
This setup provides:
- Continuous disk monitoring
- Time-window-based change detection
- Daily reporting
- Matrix integration
- Minimal operational overhead
All in a single script plus a handful of systemd units.