Server Monitoring
Keeping game servers healthy is essential for a good player experience. This page covers the tools and techniques for monitoring your PlayRen game servers.
Health Check Endpoint
Every pool agent exposes a health endpoint:
GET https://gs01.play.ren.bd/api/health
Response:
{
"status": "ok",
"containers": 7,
"cpu_percent": 42.5,
"memory_percent": 58.3,
"disk_percent": 35.2
}
Fields
| Field | Description |
|---|---|
| status | "ok" if the agent is healthy, "error" otherwise |
| containers | Number of currently running game containers |
| cpu_percent | Overall server CPU utilization |
| memory_percent | Overall server memory utilization |
| disk_percent | Disk usage percentage for the game storage volume |
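The fields above are straightforward to consume from a script. The following is a minimal sketch, assuming the response body is exactly the JSON shown in the sample; `summarize_health` is a hypothetical helper name and not part of the pool agent.

```python
import json


def summarize_health(body: str) -> dict:
    """Parse a health response body and flag whether the agent reports healthy.

    Hypothetical helper: assumes the field names from the sample response.
    """
    data = json.loads(body)
    return {
        # Only the literal string "ok" counts as healthy.
        "healthy": data.get("status") == "ok",
        "containers": int(data.get("containers", 0)),
        "cpu_percent": float(data.get("cpu_percent", 0.0)),
        "memory_percent": float(data.get("memory_percent", 0.0)),
        "disk_percent": float(data.get("disk_percent", 0.0)),
    }


sample = (
    '{"status": "ok", "containers": 7, "cpu_percent": 42.5, '
    '"memory_percent": 58.3, "disk_percent": 35.2}'
)
summary = summarize_health(sample)
print(summary["healthy"], summary["containers"])
```

Treating anything other than "ok" as unhealthy keeps the check conservative if the agent ever reports a status value not listed here.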
Automated Health Checks
The platform server polls each registered game server's health endpoint periodically. Servers that fail health checks are temporarily removed from the scheduler's rotation until they recover.
You can also set up external monitoring (e.g., UptimeRobot, Pingdom) to alert you if a server goes down.
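To illustrate the remove-and-recover behavior described above, here is a small sketch of how a poller might track rotation state. It is not the platform's actual implementation: the `RotationTracker` class and the threshold of three consecutive failures are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class RotationTracker:
    """Tracks consecutive health check failures per server (illustrative only)."""

    failure_threshold: int = 3  # assumed value, not taken from the platform
    failures: dict = field(default_factory=dict)

    def record(self, server: str, healthy: bool) -> None:
        # A successful check resets the counter, so the server recovers at once.
        if healthy:
            self.failures[server] = 0
        else:
            self.failures[server] = self.failures.get(server, 0) + 1

    def in_rotation(self, server: str) -> bool:
        # Servers stay schedulable until they hit the failure threshold.
        return self.failures.get(server, 0) < self.failure_threshold


tracker = RotationTracker()
for result in (False, False, False):
    tracker.record("gs01", result)
print(tracker.in_rotation("gs01"))  # removed after three failed checks
tracker.record("gs01", True)
print(tracker.in_rotation("gs01"))  # back in rotation after one success
```

Requiring several consecutive failures before removal avoids dropping a server because of a single slow response.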
Container Metrics
Listing Active Containers
GET https://gs01.play.ren.bd/api/containers
Authorization: Bearer {agent-secret}
Returns a list of all running game containers with their resource usage.
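The request above can be built with Python's standard library. This sketch only constructs the request and does not send it; `build_containers_request` is a hypothetical helper, and the secret shown is a placeholder. The response schema is not documented on this page, so no parsing is attempted.

```python
import urllib.request


def build_containers_request(base_url: str, agent_secret: str) -> urllib.request.Request:
    """Build an authenticated GET request for the containers endpoint."""
    url = base_url.rstrip("/") + "/api/containers"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {agent_secret}"},
        method="GET",
    )


req = build_containers_request("https://gs01.play.ren.bd", "example-secret")
print(req.full_url)

# To actually send it (requires network access and a valid secret):
# with urllib.request.urlopen(req, timeout=5) as resp:
#     body = resp.read().decode("utf-8")
```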
Per-Container Stats
When a container is stopped, the pool agent collects its resource usage:
- CPU time — total CPU seconds consumed.
- Peak memory — maximum memory usage during the session.
- Duration — how long the container ran.
These metrics are sent back to the platform and used to update each game's resource profile (exponential moving average), which informs the scheduler's capacity planning.
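An exponential moving average blends each new sample into the stored value, so recent sessions count more than old ones. The sketch below shows the general formula only; the smoothing factor `alpha` and the function name `update_profile` are assumptions, since the platform's actual values are not given here.

```python
def update_profile(previous: float, sample: float, alpha: float = 0.2) -> float:
    """Return the new moving average after observing one more sample.

    alpha is the weight given to the newest sample (assumed value);
    it must lie in the interval (0, 1].
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    return alpha * sample + (1 - alpha) * previous


# Example: stored peak memory profile of 512 MB, then three sessions report in.
profile = 512.0
for peak_mb in (600.0, 580.0, 700.0):
    profile = update_profile(profile, peak_mb)
print(round(profile, 1))
```

A larger `alpha` makes the profile react faster to changes in a game's behavior, at the cost of more noise from unusual sessions.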
Docker Stats
For real-time container monitoring, use Docker's built-in stats:
# All containers
docker stats
# Specific container
docker stats container-name
Disk Usage
Game cache is the primary consumer of disk space. Monitor it with:
# Overall disk usage
df -h /srv/games
# Per-game cache sizes
du -sh /srv/games/*
Cache Management
If disk usage gets high, you can manually clear cached games that haven't been used recently:
# Check which games are cached
ls -la /srv/games/
# Remove a specific game's cache
rm -rf /srv/games/game-slug/
The platform will re-download the game on the next launch request. The GameCacheEntry database record is updated automatically.
Future versions will include automatic LRU cache eviction.
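Until then, a script can help pick which caches to clear by hand. The sketch below lists the least recently modified game directories without deleting anything. It is an approximation, not the planned eviction logic: using directory modification time as a stand-in for "last used" is an assumption, and `eviction_candidates` is a hypothetical helper.

```python
import os


def eviction_candidates(cache_root: str, count: int) -> list[str]:
    """Return up to `count` game directory names, oldest modification time first."""
    entries = []
    with os.scandir(cache_root) as it:
        for entry in it:
            # Only consider real directories; skip files and symlinks.
            if entry.is_dir(follow_symlinks=False):
                mtime = entry.stat(follow_symlinks=False).st_mtime
                entries.append((mtime, entry.name))
    entries.sort()
    return [name for _, name in entries[:count]]


if __name__ == "__main__" and os.path.isdir("/srv/games"):
    # Print candidates only; removal stays a manual step.
    for name in eviction_candidates("/srv/games", 5):
        print(name)
```

Review the list before removing anything, since a game that was cached long ago may still be launched often.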
Logs
Pool Agent Logs
The pool agent runs as a systemd service. View logs with:
# Follow live logs
journalctl -u xgame9-agent -f
# Last 100 lines
journalctl -u xgame9-agent -n 100
# Logs from today
journalctl -u xgame9-agent --since today
Docker Container Logs
View logs from a specific game container:
docker logs container-name
docker logs -f container-name # Follow
Nginx Logs
# Access log
tail -f /var/log/nginx/access.log
# Error log
tail -f /var/log/nginx/error.log
Common Alerts to Set Up
Consider monitoring for these conditions:
| Condition | Threshold | Action |
|---|---|---|
| CPU usage | > 90% sustained | Scale up or limit containers |
| Memory usage | > 85% | Reduce MAX_CONTAINERS |
| Disk usage | > 80% | Clear old game cache |
| Agent down | Health check fails 3x | Investigate and restart |
| Container count | At MAX_CONTAINERS | New launches will queue |
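The thresholds in the table can be checked against a single health response. This is a rough sketch: `evaluate_alerts` is a hypothetical helper, it looks at one snapshot only, and it therefore cannot tell whether CPU usage is sustained or whether the agent has failed three checks in a row. Those two conditions need state kept across polls.

```python
def evaluate_alerts(health: dict, max_containers: int) -> list[str]:
    """Return the suggested actions for one health snapshot.

    `health` uses the field names from the health endpoint response;
    `max_containers` is the agent's MAX_CONTAINERS setting.
    """
    actions = []
    if health["cpu_percent"] > 90:
        actions.append("Scale up or limit containers")
    if health["memory_percent"] > 85:
        actions.append("Reduce MAX_CONTAINERS")
    if health["disk_percent"] > 80:
        actions.append("Clear old game cache")
    if health["containers"] >= max_containers:
        actions.append("New launches will queue")
    return actions


snapshot = {
    "status": "ok",
    "containers": 7,
    "cpu_percent": 42.5,
    "memory_percent": 58.3,
    "disk_percent": 35.2,
}
print(evaluate_alerts(snapshot, max_containers=10))
```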
Server Resource Planning
Track these metrics over time to plan capacity:
- Peak concurrent sessions — highest number of simultaneous containers.
- Average session duration — how long players typically play.
- Cache hit rate — percentage of launches that find the game already cached.
- Launch latency — time from launch request to stream connection.
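Some of these metrics can be derived from session records. The sketch below assumes each session is available as a `(start, end)` pair of timestamps in seconds; that record shape and the function names are hypothetical, not a format the platform is known to export.

```python
def peak_concurrent(sessions: list[tuple[float, float]]) -> int:
    """Return the highest number of sessions running at the same moment."""
    events = []
    for start, end in sessions:
        events.append((start, 1))
        events.append((end, -1))
    # At equal timestamps, ends (-1) sort before starts (+1), so a session
    # that begins exactly when another finishes is not counted as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def average_duration(sessions: list[tuple[float, float]]) -> float:
    """Return the mean session length in seconds, or 0.0 with no sessions."""
    if not sessions:
        return 0.0
    return sum(end - start for start, end in sessions) / len(sessions)


def cache_hit_rate(hits: int, launches: int) -> float:
    """Return the percentage of launches that found the game already cached."""
    return 100.0 * hits / launches if launches else 0.0


sessions = [(0, 10), (5, 15), (10, 20)]
print(peak_concurrent(sessions), average_duration(sessions), cache_hit_rate(45, 60))
```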
Next Steps
- Configuration — Adjust agent settings and limits.
- Overview — Review the overall architecture.