Quick reference
Daily tasks:
- Check disk space and table sizes
- Monitor bloat levels (> 50% requires action)
- Verify the database vacuum service is running (if enabled)
Weekly tasks:
- Review autovacuum performance
- Check index usage and bloat
- Analyze high-traffic tables
Warning signs that need immediate attention:
- Disk usage > 80%
- Table bloat > 100%
- Connection count approaching limit
- Autovacuum hasn’t run in 24+ hours
Database growth monitoring
Prefect stores entities like events, flow runs, task runs, and logs that accumulate over time. Monitor your database regularly to understand growth patterns specific to your usage.
Check table sizes
Tables that commonly grow large in a Prefect database:
- events - Automatically generated for all state changes (often the largest table)
- log - Flow and task run logs
- flow_run and task_run - Execution records
- flow_run_state and task_run_state - State history
Monitor disk space
Track overall disk usage to prevent outages.
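Both checks in this section (table sizes and overall disk usage) can be sketched roughly as follows. The query relies on PostgreSQL's pg_stat_user_tables view and its size functions. The helper names, the DB-API cursor argument, and the 80% threshold (borrowed from the quick reference) are illustrative assumptions, not part of Prefect.

```python
import shutil

# Largest tables first, with PostgreSQL's estimated live row counts.
TABLE_SIZES_SQL = """
SELECT relname AS table_name,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
       n_live_tup AS estimated_rows
FROM pg_stat_user_tables
ORDER BY pg_total_relation_size(relid) DESC
LIMIT 20;
"""

# Total size of the database the current session is connected to.
DATABASE_SIZE_SQL = "SELECT pg_size_pretty(pg_database_size(current_database()));"


def fetch_table_sizes(cursor):
    """Run the table-size query on any DB-API cursor (for example, psycopg)."""
    cursor.execute(TABLE_SIZES_SQL)
    return cursor.fetchall()


def disk_usage_percent(used_bytes: int, total_bytes: int) -> float:
    """Return used space as a percentage of total space."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def disk_needs_attention(path: str = "/", threshold: float = 80.0) -> bool:
    """Compare the volume holding `path` (a hypothetical data directory) to a threshold."""
    usage = shutil.disk_usage(path)
    return disk_usage_percent(usage.used, usage.total) > threshold
```

The disk check only makes sense when it runs on the host that stores the PostgreSQL data directory. For managed databases, use the provider's storage metrics instead.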
Monitor table bloat
PostgreSQL tables can accumulate “dead tuples” from updates and deletes. Monitor the bloat percentage to identify tables that need maintenance.
Monitor index bloat
Indexes can also bloat and degrade performance.
PostgreSQL VACUUM
VACUUM reclaims storage occupied by dead tuples. PostgreSQL runs autovacuum automatically, but heavily updated tables may need manual intervention.
Manual VACUUM
Run VACUUM manually on tables with high bloat percentages.
Monitor autovacuum
Check whether autovacuum is keeping up with your workload.
Tune autovacuum for Prefect workloads
Depending on your workload, your write patterns may require more aggressive autovacuum settings than the defaults.
When to take action
Bloat thresholds:
- < 20% bloat: Normal, autovacuum should handle it
- 20-50% bloat: Monitor closely, consider manual VACUUM
- > 50% bloat: Manual VACUUM recommended
- > 100% bloat: Significant performance impact, urgent action needed
Other signs that maintenance is needed:
- Autovacuum hasn’t run in > 24 hours on active tables
- Query performance degrading over time
- Disk space usage growing faster than data volume
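The thresholds above can be turned into a small check. This sketch assumes bloat is measured as dead tuples relative to live tuples, which is the reading under which values above 100% are possible. The query uses PostgreSQL's pg_stat_user_tables view. The function names and return labels are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Live and dead tuple counts per table, plus the last autovacuum time.
BLOAT_SQL = """
SELECT relname AS table_name,
       n_live_tup,
       n_dead_tup,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
"""


def bloat_percent(n_live_tup: int, n_dead_tup: int) -> float:
    """Dead tuples as a percentage of live tuples (can exceed 100)."""
    if n_live_tup <= 0:
        # No live rows: any dead tuples count as unbounded bloat.
        return float("inf") if n_dead_tup > 0 else 0.0
    return 100.0 * n_dead_tup / n_live_tup


def bloat_action(percent: float) -> str:
    """Map a bloat percentage onto the thresholds listed above."""
    if percent > 100:
        return "urgent"
    if percent > 50:
        return "manual vacuum"
    if percent >= 20:
        return "monitor"
    return "normal"


def autovacuum_is_stale(
    last_autovacuum: Optional[datetime],
    now: Optional[datetime] = None,
    max_age: timedelta = timedelta(hours=24),
) -> bool:
    """True when autovacuum has never run or last ran more than `max_age` ago."""
    if last_autovacuum is None:
        return True
    now = now or datetime.now(timezone.utc)
    return now - last_autovacuum > max_age
```

For a table classified as "manual vacuum" or "urgent", a statement such as VACUUM (VERBOSE, ANALYZE) events; is the usual next step.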
Database vacuum service
Prefect server includes a built-in database vacuum service that automatically cleans up old data. The service runs as a background process alongside Prefect server and handles deletion of:
- Old top-level flow runs that have reached a terminal state (completed, failed, cancelled, or crashed)
- Orphaned logs (logs referencing flow runs that no longer exist)
- Orphaned artifacts (artifacts referencing flow runs that no longer exist)
- Stale artifact collections (collections whose latest artifact has been deleted)
- Old events and event resources past the event retention period
Enable the vacuum service
The vacuum service has two independent components, both controlled by PREFECT_SERVER_SERVICES_DB_VACUUM_ENABLED:
- Event vacuum (events): Cleans up old events and event resources. Enabled by default.
- Flow run vacuum (flow_runs): Cleans up old flow runs, orphaned logs, orphaned artifacts, and stale artifact collections. Disabled by default.
When you run prefect server start (single-process mode), background services run automatically. If you use --no-services or --workers > 1, or run a scaled deployment, start background services separately with prefect server services start to ensure the vacuum service runs.
Configure the vacuum service
The following settings control vacuum behavior:
| Setting | Default | Description |
|---|---|---|
| PREFECT_SERVER_SERVICES_DB_VACUUM_ENABLED | events | Comma-separated set of vacuum types to enable. Valid values: events, flow_runs. |
| PREFECT_SERVER_SERVICES_DB_VACUUM_LOOP_SECONDS | 3600 (1 hour) | How often the vacuum cycle runs, in seconds. |
| PREFECT_SERVER_SERVICES_DB_VACUUM_RETENTION_PERIOD | 7776000 (90 days) | How old a flow run must be (based on end time) before it is eligible for deletion. Accepts seconds. Must be greater than 1 hour. |
| PREFECT_SERVER_SERVICES_DB_VACUUM_BATCH_SIZE | 200 | Number of records to delete per database transaction. |
| PREFECT_SERVER_SERVICES_DB_VACUUM_EVENT_RETENTION_OVERRIDES | {"prefect.flow-run.heartbeat": 604800} | Per-event-type retention period overrides in seconds. Event types not listed fall back to PREFECT_EVENTS_RETENTION_PERIOD. Each override is capped by the global events retention period. |
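As an illustration of how these settings fit together, the sketch below builds the environment variables for a server that enables both vacuum components. The setting names, valid values, and the one-hour minimum come from the table above. The vacuum_env helper and its defaults are hypothetical.

```python
VALID_VACUUM_TYPES = {"events", "flow_runs"}
MIN_RETENTION_SECONDS = 3600  # the retention period must be greater than 1 hour


def vacuum_env(enabled=("events", "flow_runs"), retention_days=30,
               loop_seconds=3600, batch_size=200):
    """Return vacuum settings as environment variable strings."""
    unknown = set(enabled) - VALID_VACUUM_TYPES
    if unknown:
        raise ValueError(f"unknown vacuum types: {sorted(unknown)}")
    retention_seconds = int(retention_days * 86400)
    if retention_seconds <= MIN_RETENTION_SECONDS:
        raise ValueError("retention period must be greater than 1 hour")
    if batch_size <= 0 or loop_seconds <= 0:
        raise ValueError("batch_size and loop_seconds must be positive")
    return {
        "PREFECT_SERVER_SERVICES_DB_VACUUM_ENABLED": ",".join(sorted(enabled)),
        "PREFECT_SERVER_SERVICES_DB_VACUUM_LOOP_SECONDS": str(loop_seconds),
        "PREFECT_SERVER_SERVICES_DB_VACUUM_RETENTION_PERIOD": str(retention_seconds),
        "PREFECT_SERVER_SERVICES_DB_VACUUM_BATCH_SIZE": str(batch_size),
    }
```

The resulting dictionary can be merged into the environment of the process that runs the background services, for example through the env argument of subprocess.run.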
How the vacuum service works
Each vacuum cycle schedules independent cleanup tasks:
- For each event type listed in PREFECT_SERVER_SERVICES_DB_VACUUM_EVENT_RETENTION_OVERRIDES, the event vacuum deletes events and their associated resources older than the configured per-type retention period.
- All other events and event resources are deleted once they are older than PREFECT_EVENTS_RETENTION_PERIOD.
- When enabled, the flow run vacuum deletes old flow runs, orphaned logs, orphaned artifacts, and stale artifact collections.
The event vacuum only runs when the event persister service is also enabled (PREFECT_SERVER_SERVICES_EVENT_PERSISTER_ENABLED=true, which is the default). This prevents unexpected data deletion for deployments that have disabled event processing.
Records are deleted in batches (controlled by PREFECT_SERVER_SERVICES_DB_VACUUM_BATCH_SIZE) to avoid long-running transactions that could impact database performance.
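To show why batching keeps transactions short, here is a generic sketch of batched deletion. It is not Prefect's implementation: it uses an in-memory SQLite database and a hypothetical demo_events table so that it runs anywhere, and it commits after every batch.

```python
import sqlite3


def delete_in_batches(conn, cutoff, batch_size=200):
    """Delete rows older than `cutoff` in small transactions.

    Returns a (rows_deleted, batches) tuple.
    """
    total = 0
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM demo_events WHERE id IN ("
            " SELECT id FROM demo_events WHERE occurred < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # one short transaction per batch
        if cur.rowcount == 0:
            break
        total += cur.rowcount
        batches += 1
    return total, batches


def demo():
    """Insert 500 rows, delete the 450 oldest in batches of 200."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_events (id INTEGER PRIMARY KEY, occurred INTEGER)"
    )
    conn.executemany(
        "INSERT INTO demo_events (occurred) VALUES (?)",
        [(i,) for i in range(500)],
    )
    conn.commit()
    deleted, batches = delete_in_batches(conn, cutoff=450, batch_size=200)
    remaining = conn.execute("SELECT COUNT(*) FROM demo_events").fetchone()[0]
    conn.close()
    return deleted, batches, remaining
```

Smaller batches mean more round trips but shorter lock hold times, which is the trade-off the batch size setting exposes.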
Tune the vacuum for your workload
- High-volume deployments (thousands of runs per day): Consider a shorter retention period (for example, 7-14 days) and a more frequent vacuum cycle (every few hours) to prevent data accumulation.
- Low-volume deployments: The defaults (90-day retention, hourly cycle) are appropriate for most cases.
- Large batch size: Increasing PREFECT_SERVER_SERVICES_DB_VACUUM_BATCH_SIZE speeds up cleanup but may hold database locks longer. Decrease the batch size if you observe performance impacts during vacuum cycles.
- Scaled deployments: If you run separate API and background service processes, ensure the background services pod is running to enable the vacuum service.
Data retention with a custom flow
As an alternative to the built-in vacuum service, you can implement custom retention logic using a Prefect flow. This approach gives you more control over which flow runs to delete and allows you to add custom logic such as notifications or conditional retention.
Using the Prefect API ensures proper cleanup of all related data, including logs and artifacts. The API handles cascade deletions and triggers necessary background tasks.
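A minimal sketch of such retention logic, built on the Prefect Python client, might look like the following. The 30-day default, the batch size, and the one-second pause between batches are assumptions to adjust. The function is written as a plain coroutine with Prefect imported lazily, so it can be wrapped in a flow or called directly.

```python
import asyncio
from datetime import datetime, timedelta, timezone
from typing import Optional


def retention_cutoff(days_to_keep: int, now: Optional[datetime] = None) -> datetime:
    """Return the timestamp before which flow runs are eligible for deletion."""
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=days_to_keep)


async def delete_old_flow_runs(days_to_keep: int = 30, batch_size: int = 100) -> int:
    """Delete terminal flow runs that started before the cutoff; return the count."""
    # Prefect is imported lazily so the cutoff helper works without it installed.
    from prefect.client.orchestration import get_client
    from prefect.client.schemas.filters import (
        FlowRunFilter,
        FlowRunFilterStartTime,
        FlowRunFilterState,
        FlowRunFilterStateType,
    )
    from prefect.client.schemas.objects import StateType
    from prefect.exceptions import ObjectNotFound

    run_filter = FlowRunFilter(
        start_time=FlowRunFilterStartTime(before_=retention_cutoff(days_to_keep)),
        state=FlowRunFilterState(
            type=FlowRunFilterStateType(
                any_=[
                    StateType.COMPLETED,
                    StateType.FAILED,
                    StateType.CANCELLED,
                    StateType.CRASHED,
                ]
            )
        ),
    )
    deleted = 0
    async with get_client() as client:
        while True:
            runs = await client.read_flow_runs(
                flow_run_filter=run_filter, limit=batch_size
            )
            if not runs:
                break
            for run in runs:
                try:
                    await client.delete_flow_run(run.id)
                    deleted += 1
                except ObjectNotFound:
                    pass  # already removed by another process
            await asyncio.sleep(1)  # assumed pause to limit load between batches
    return deleted
```

Calling asyncio.run(delete_old_flow_runs(days_to_keep=30)) runs one cleanup pass against the API configured for the current profile.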
Direct SQL approach
In some cases, you may need to use direct SQL for performance reasons or when the API is unavailable. Be aware that direct deletion bypasses application-level cascade logic and may leave orphaned logs and artifacts.
Important considerations
- Filtering limitation: The custom flow example above filters by start_time (when the flow run began execution), not created time (when the flow run was created in the database). This means flows that were created but never started are not deleted by this approach. The built-in vacuum service uses end_time instead, so it can clean up runs that reached a terminal state without ever entering a running state.
- Test first: Run with SELECT instead of DELETE to preview what will be removed
- Start conservative: Begin with longer retention periods and adjust based on needs
- Monitor performance: Large deletes can impact database performance
- Backup: Always back up before major cleanup operations
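The "test first" advice can be applied by generating the preview and the delete from the same WHERE clause. This sketch assumes the flow_run table exposes end_time and state_type columns with upper-case state names. Verify that against your schema before running anything. The cursor is any DB-API cursor that accepts named parameters in the %(name)s style, such as psycopg.

```python
# Shared filter so the preview and the delete always target the same rows.
# Column names and state values are assumptions about the Prefect schema.
WHERE_CLAUSE = (
    "end_time < now() - make_interval(days => %(days)s) "
    "AND state_type IN ('COMPLETED', 'FAILED', 'CANCELLED', 'CRASHED')"
)


def preview_sql() -> str:
    """SELECT that lists the rows a cleanup would remove."""
    return f"SELECT id, end_time FROM flow_run WHERE {WHERE_CLAUSE};"


def delete_sql() -> str:
    """DELETE with exactly the same filter as the preview."""
    return f"DELETE FROM flow_run WHERE {WHERE_CLAUSE};"


def cleanup(cursor, days: int, dry_run: bool = True) -> int:
    """Run the preview (default) or the delete; return the affected row count."""
    cursor.execute(preview_sql() if dry_run else delete_sql(), {"days": days})
    return cursor.rowcount
```

Keeping dry_run=True as the default means a delete only happens when it is requested explicitly.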
Event retention
Events are automatically generated for all state changes in Prefect, and the events table can quickly become the largest in your database. Prefect includes built-in event retention that automatically removes old events.
Configure event retention
The default retention period is 7 days. For high-volume deployments running many flow runs per minute, this default can lead to rapid database growth. Consider your workload when setting retention:
| Workload | Suggested retention | Rationale |
|---|---|---|
| Low volume (< 100 runs/day) | 7 days (default) | Default is appropriate |
| Medium volume (100-1000 runs/day) | 3-5 days | Balance history with growth |
| High volume (1000+ runs/day) | 1-2 days | Prioritize database performance |
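The table above can be expressed as a small lookup, shown here as a sketch. The function names are hypothetical, and exactly 1000 runs per day is treated as high volume because the table's ranges overlap at that value.

```python
def suggested_event_retention_days(runs_per_day: int) -> tuple:
    """Return the (min_days, max_days) range suggested in the table above."""
    if runs_per_day < 0:
        raise ValueError("runs_per_day must not be negative")
    if runs_per_day < 100:
        return (7, 7)   # low volume: keep the default
    if runs_per_day < 1000:
        return (3, 5)   # medium volume
    return (1, 2)       # high volume


def days_to_seconds(days: int) -> int:
    """Convert a retention period in days to seconds."""
    return days * 86400
```

The chosen value then goes into PREFECT_EVENTS_RETENTION_PERIOD. Check the settings reference for the formats that setting accepts.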
Check event table size
Monitor your event table growth.
Events are used for automations and triggers. Ensure your retention period keeps events long enough for your automation needs.
Connection monitoring
Monitor connection usage to prevent exhaustion.
Automating database maintenance
Schedule maintenance tasks
Schedule the retention flow to run automatically. See the guide on creating deployments for how to set up a scheduled deployment. For example, you could run the retention flow daily at 2 AM to clean up old flow runs.
Recommended maintenance schedule
- Hourly: Monitor disk space and connection count
- Daily: Run retention policies, check bloat levels
- Weekly: Analyze tables, review autovacuum performance
- Monthly: REINDEX heavily used indexes, full database backup
Troubleshooting common issues
“VACUUM is taking forever”
- Check for long-running transactions blocking VACUUM
- Consider using pg_repack instead of VACUUM FULL
- Run during low-traffic periods
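One way to look for long-running transactions is to query PostgreSQL's pg_stat_activity view, as in this sketch. The five-minute default and the helper name are assumptions. The cursor is any DB-API cursor that accepts named parameters in the %(name)s style.

```python
# Sessions whose current transaction has been open longer than the threshold.
LONG_TRANSACTIONS_SQL = """
SELECT pid,
       state,
       now() - xact_start AS transaction_age,
       left(query, 80) AS query_preview
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND now() - xact_start > make_interval(mins => %(minutes)s)
ORDER BY xact_start;
"""


def find_long_transactions(cursor, minutes: int = 5):
    """Return sessions with transactions older than `minutes`, oldest first."""
    cursor.execute(LONG_TRANSACTIONS_SQL, {"minutes": minutes})
    return cursor.fetchall()
```

Sessions reported as idle in transaction are common culprits, since they hold a snapshot open without doing any work.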
“Database is growing despite retention policies”
- Verify event retention is configured: prefect config view | grep EVENTS_RETENTION
- Verify the vacuum service is enabled: prefect config view | grep DB_VACUUM
- Check if autovacuum is running on the events table
- Ensure the vacuum service or retention flow is actually executing (check server logs for “Database vacuum” messages)
“Queries are getting slower over time”
- Update table statistics: ANALYZE;
- Check for missing indexes using pg_stat_user_tables
- Review query plans with EXPLAIN ANALYZE
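As a rough heuristic for the missing-index check, the sketch below compares sequential and index scan counts from pg_stat_user_tables. The 10,000-row cutoff and the function name are assumptions. A flagged table is a candidate for a closer look with EXPLAIN ANALYZE, not proof that an index is missing.

```python
# Scan counters per table; idx_scan is NULL for tables without indexes.
SCAN_STATS_SQL = """
SELECT relname AS table_name,
       seq_scan,
       COALESCE(idx_scan, 0) AS idx_scan,
       n_live_tup
FROM pg_stat_user_tables
ORDER BY seq_scan DESC;
"""


def likely_missing_index(seq_scan: int, idx_scan: int, n_live_tup: int,
                         min_rows: int = 10_000) -> bool:
    """Flag large tables that are read mostly through sequential scans."""
    return n_live_tup >= min_rows and seq_scan > idx_scan
```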
“Connection limit reached”
- Implement connection pooling immediately
- Check for connection leaks: connections in ‘idle’ state for hours
- Reduce Prefect worker connection counts
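Connection usage and long-idle sessions can both be read from pg_stat_activity, as in this sketch. The backend_type filter needs PostgreSQL 10 or later. The one-hour idle window and the 80% warning threshold are illustrative assumptions.

```python
# Client connections in use versus the configured limit (PostgreSQL 10+).
CONNECTION_USAGE_SQL = """
SELECT count(*) AS connections,
       current_setting('max_connections')::int AS max_connections
FROM pg_stat_activity
WHERE backend_type = 'client backend';
"""

# Sessions that have been idle for more than an hour, a possible sign of leaks.
IDLE_CONNECTIONS_SQL = """
SELECT pid,
       usename,
       application_name,
       now() - state_change AS idle_for
FROM pg_stat_activity
WHERE state = 'idle'
  AND state_change < now() - interval '1 hour'
ORDER BY state_change;
"""


def connection_usage_percent(connections: int, max_connections: int) -> float:
    """Return connections in use as a percentage of the limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return 100.0 * connections / max_connections


def near_connection_limit(connections: int, max_connections: int,
                          threshold: float = 80.0) -> bool:
    """True when usage is at or above the warning threshold."""
    return connection_usage_percent(connections, max_connections) >= threshold
```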